Move TorchDynamo into PyTorch core (#86461)

Context:
https://github.com/pytorch/torchdynamo/issues/1588

This PR moves [TorchDynamo](https://github.com/pytorch/torchdynamo) and TorchInductor into PyTorch core.
- `torchdynamo` becomes `torch._dynamo`
- `torchinductor` becomes `torch._inductor`

This PR was generated by running `copy_to_core.sh` in https://github.com/pytorch/torchdynamo/pull/1538

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86461
Approved by: https://github.com/voznesenskym
This commit is contained in:
Jason Ansel
2022-10-13 23:18:06 +00:00
committed by PyTorch MergeBot
parent 97abc21f2b
commit c7c09722ad
308 changed files with 85171 additions and 14 deletions

View File

@ -36,7 +36,7 @@ popd
=======
:: Pin unittest-xml-reporting to freeze printing test summary logic, related: https://github.com/pytorch/pytorch/issues/69014
pip install "ninja==1.10.0.post1" future "hypothesis==5.35.1" "expecttest==0.1.3" "librosa>=0.6.2" "scipy==1.6.3" psutil pillow "unittest-xml-reporting<=3.2.0,>=2.0.0" pytest pytest-xdist pytest-shard pytest-rerunfailures "xdoctest==1.0.2" "pygments==2.12.0" "opt-einsum>=3.3"
pip install "ninja==1.10.0.post1" future "hypothesis==5.35.1" "expecttest==0.1.3" "librosa>=0.6.2" "scipy==1.6.3" psutil pillow "unittest-xml-reporting<=3.2.0,>=2.0.0" pytest pytest-xdist pytest-shard pytest-rerunfailures sympy "xdoctest==1.0.2" "pygments==2.12.0" "opt-einsum>=3.3"
if errorlevel 1 exit /b
if not errorlevel 0 exit /b

View File

@ -0,0 +1,50 @@
# Torchdynamo Benchmarks
## What We Benchmark
TorchDynamo provides a benchmark harness that takes care of uniformly benchmarking different models. It interleaves runs of eager and dynamo to avoid machine noise/variability issues, and reports results based on medians along with P-values.
The runner integrates with models from TorchBenchmark, HuggingFace and TIMM suites and covers both training and inference.
The infrastructure allows us to specify a loss function. For torchbench models, we use .sum().backward() call in place of the native loss function. For TIMM models, we use a CrossEntropy loss. And HF models contain a loss function inside the model itself, so we don't need any special loss computation handling.
Training benchmarks approximate training by running the model forward, computing loss and then running backward. We entirely skip the optimizer step today.
Inference benchmarks and Training benchmarks measure correctness by comparing dynamo and eager model outputs given fixed inputs and seeds.
## Setup
### Machine
We run benchmarks on AWS machines (p4d.24xlarge) using 8xNVidia A100 40GB cards. We suggest using Cuda 11.6 for consistency.
### Benchmarks
Make sure to carefully follow the [torchbench installation](https://github.com/pytorch/benchmark#installation) instructions, taking care to build the auxiliary libraries (torchvision, torchtext) from a matching version to your pytorch version.
For HF and TIMM models, the scripts already install the transformers and timm package respectively on the first run.
## Runbook
### Basic Usage
There are a lot of flags in the benchmark runner, and it can be confusing to know which settings to use or what machine to run it on. In order to support apples-to-apples comparison, we have provided the following 'standard' settings in `runner.py`. This script is a wrapper over the common benchmarking infrastructure and simplifies the flags. We will continually update `runner.py` with the latest and most relevant compilers for training and inference. It also provides some graph utilities to visualize and compare results. Some of the example commands are
**Inference Commands**
* Inference compilers on torchbench models - `python benchmarks/runner.py --suites=torchbench --inference --dtypes=float16`
**Training Commands**
* Training compilers on TIMM models - `python benchmarks/runner.py --suites=timm_models --training --dtypes=float32 --output-dir=timm_logs`
* AOTAutograd Training compiler on TIMM models - `python benchmarks/runner.py --suites=timm_models --training --dtypes=float32 --compilers=aot_nvfuser --output-dir=timm_logs`
Running runner.py generates a file named `run.sh`. This file contains the actual commands that invoke the common benchmarking infrastructure with the appropriate flags. Which brings us to the advanced usage.
### Advanced Usage
One could directly call `torchbench.py`, `huggingface.py` or `timm_models.py` with the necessary flags. There are a lot of flags in the benchmarks runner. Some of the examples are as follows. These are subject to change.
**Inference Commands**
* TorchScript NVFuser Inference - `python benchmarks/torchbench.py -dcuda -n100 --speedup-ts`
* TorchInductor CUDA Graphs Inference - `python benchmarks/torchbench.py -dcuda --inductor-settings --float32 -n50 --inductor`
**Training Commands**
* Torchscript (with TorchDynamo capture) NVFuser Training - `python benchmarks/torchbench.py --float32 -dcuda --training --nvfuser --speedup-dynamo-ts --use-eval-mode`
* AOTAutograd Torchscript NVFuser Training - `python benchmarks/torchbench.py --float32 -dcuda --training --nvfuser --accuracy-aot-ts-mincut --use-eval-mode`
Above commands are for torchbench models. You can simply replace `torchbench.py` with `huggingface.py` for HF models, and `timm_model.py` for TIMM models.

View File

2021
benchmarks/dynamo/common.py Normal file

File diff suppressed because it is too large Load Diff

543
benchmarks/dynamo/huggingface.py Executable file
View File

@ -0,0 +1,543 @@
#!/usr/bin/env python3
import importlib
import logging
import os
import re
import subprocess
import sys
import warnings
import torch
from common import BenchmarkRunner, main
from torch._dynamo.testing import collect_results
from torch._dynamo.utils import clone_inputs
log = logging.getLogger(__name__)
def pip_install(package):
subprocess.check_call([sys.executable, "-m", "pip", "install", package])
# Disable the flake warnings for the imports. Flake8 does not provide a way to
# disable just warning for the entire file. Disabling flake8 entirely.
# flake8: noqa
imports = [
"AlbertForPreTraining",
"AutoConfig",
"AutoModelForCausalLM",
"AutoModelForMaskedLM",
"AutoModelForSeq2SeqLM",
"BigBirdConfig",
"BlenderbotForConditionalGeneration",
"BlenderbotModel",
"BlenderbotSmallForConditionalGeneration",
"BlenderbotSmallModel",
"CLIPModel",
"CLIPVisionModel",
"ElectraForPreTraining",
"GPT2ForSequenceClassification",
"GPTJForSequenceClassification",
"GPTNeoForSequenceClassification",
"HubertForSequenceClassification",
"LxmertForPreTraining",
"LxmertForQuestionAnswering",
"MarianForCausalLM",
"MarianModel",
"MarianMTModel",
"PegasusForConditionalGeneration",
"PegasusModel",
"ReformerConfig",
"ViTForImageClassification",
"ViTForMaskedImageModeling",
"ViTModel",
]
try:
mod = importlib.import_module("transformers")
for cls in imports:
if not hasattr(mod, cls):
raise ModuleNotFoundError
except ModuleNotFoundError:
print("Installing HuggingFace Transformers...")
pip_install("git+https://github.com/huggingface/transformers.git#egg=transformers")
finally:
for cls in imports:
exec(f"from transformers import {cls}")
USE_HALF_BATCH_SIZE = True
# These models contain the models present in huggingface_models_list. It is a
# combination of models supported by HF Fx parser and some manually supplied
# models. For these models, we already know the largest batch size that can fit
# on A100 GPUs - 40 GB.
BATCH_SIZE_KNOWN_MODELS = dict()
# Get the list of models and their batch sizes
MODELS_FILENAME = "huggingface_models_list.txt"
if os.path.exists("benchmarks"):
MODELS_FILENAME = os.path.join("benchmarks", MODELS_FILENAME)
assert os.path.exists(MODELS_FILENAME)
with open(MODELS_FILENAME, "r") as fh:
lines = fh.readlines()
lines = [line.rstrip() for line in lines]
for line in lines:
model_name, batch_size = line.split(",")
batch_size = int(batch_size)
BATCH_SIZE_KNOWN_MODELS[model_name] = batch_size
assert len(BATCH_SIZE_KNOWN_MODELS)
SKIP = {
# Difficult to run and compare
"Reformer",
# Fails deepcopy
"BlenderbotForCausalLM",
"BlenderbotForConditionalGeneration",
"GPTJForCausalLM",
"GPTJForQuestionAnswering",
"GPTNeoForCausalLM",
"GPTNeoForSequenceClassification",
# Fails with even batch size = 1
"DebertaV2ForMaskedLM",
"DebertaV2ForQuestionAnswering",
}
# TODO - Fails even after fake tensors
USE_SMALL_BATCH_SIZE = {
"AlbertForMaskedLM": 2,
"AlbertForPreTraining": 4,
"AlbertForQuestionAnswering": 2,
"BartForCausalLM": 2,
"BartForConditionalGeneration": 1,
"BlenderbotSmallForConditionalGeneration": 32,
"DebertaForMaskedLM": 4,
"DebertaForQuestionAnswering": 4,
"DebertaV2ForMaskedLM": 1,
"DebertaV2ForQuestionAnswering": 1,
"DistilBertForMaskedLM": 16,
"ElectraForCausalLM": 1,
"GPTNeoForCausalLM": 1,
"GPTNeoForSequenceClassification": 1,
"M2M100ForConditionalGeneration": 2,
"MT5ForConditionalGeneration": 2,
"MegatronBertForCausalLM": 2,
"OPTForCausalLM": 4,
"PegasusForCausalLM": 8,
"PegasusForConditionalGeneration": 4,
"RobertaForCausalLM": 4,
"TrOCRForCausalLM": 8,
"XGLMForCausalLM": 1,
"XLNetLMHeadModel": 4,
}
def get_module_cls_by_model_name(model_cls_name):
_module_by_model_name = {
"Speech2Text2Decoder": "transformers.models.speech_to_text_2.modeling_speech_to_text_2",
"TrOCRDecoder": "transformers.models.trocr.modeling_trocr",
}
module_name = _module_by_model_name.get(model_cls_name, "transformers")
module = importlib.import_module(module_name)
return getattr(module, model_cls_name)
def get_sequence_length(model_cls, model_name):
if model_name.startswith(("Bert", "Roberta", "Blenderbot")):
seq_length = 128
elif model_name.startswith(("GPT2", "Bart", "T5")):
seq_length = 1024
elif model_name in ("AllenaiLongformerBase", "BigBird"):
seq_length = 1024
elif "Reformer" in model_name:
seq_length = 4096
elif model_name.startswith(
("Albert", "Deberta", "Layout", "Electra", "XLNet")
) or model_name in ("DistillGPT2", "GoogleFnet", "YituTechConvBert", "CamemBert"):
seq_length = 512
else:
log.warning(
f"Sequence Length not defined for {model_name}. Choosing 128 arbitrarily"
)
seq_length = 128
return seq_length
def generate_inputs_for_model(
model_cls, model, model_name, bs, device, include_loss_args=False
):
# TODO - Check if following values are representative
num_choices = 3
num_visual_features = 42
seq_length = get_sequence_length(model_cls, model_name)
vocab_size = model.config.vocab_size
if model_name.endswith("MultipleChoice"):
input = rand_int_tensor(device, 0, vocab_size, (bs, num_choices, seq_length))
elif model_name.startswith("Roberta"):
input = rand_int_tensor(device, 0, 1, (bs, seq_length))
else:
input = rand_int_tensor(device, 0, vocab_size, (bs, seq_length))
if "Bart" in model_name:
input[:, -1] = model.config.eos_token_id
input_dict = {"input_ids": input}
if (
model_name.startswith("T5")
or model_name.startswith("M2M100")
or model_name.startswith("MT5")
or model_cls
in [
BlenderbotModel,
BlenderbotSmallModel,
BlenderbotForConditionalGeneration,
BlenderbotSmallForConditionalGeneration,
PegasusModel,
PegasusForConditionalGeneration,
MarianModel,
MarianMTModel,
]
):
input_dict["decoder_input_ids"] = input
if model_name.startswith("Lxmert"):
visual_feat_dim, visual_pos_dim = (
model.config.visual_feat_dim,
model.config.visual_pos_dim,
)
input_dict["visual_feats"] = torch.randn(
bs, num_visual_features, visual_feat_dim
)
input_dict["visual_pos"] = torch.randn(bs, num_visual_features, visual_pos_dim)
if include_loss_args:
if model_name.endswith("PreTraining"):
if model_cls in [ElectraForPreTraining, LxmertForPreTraining]:
input_dict["labels"] = rand_int_tensor(device, 0, 1, (bs, seq_length))
else:
label_name = (
"sentence_order_label"
if model_cls in [AlbertForPreTraining]
else "next_sentence_label"
)
input_dict["labels"] = (
rand_int_tensor(device, 0, vocab_size, (bs, seq_length)),
)
input_dict[label_name] = rand_int_tensor(device, 0, 1, (bs,))
elif model_name.endswith("QuestionAnswering"):
input_dict["start_positions"] = rand_int_tensor(
device, 0, seq_length, (bs,)
)
input_dict["end_positions"] = rand_int_tensor(device, 0, seq_length, (bs,))
elif (
model_name.endswith("MaskedLM")
or model_name.endswith("HeadModel")
or model_name.endswith("CausalLM")
or model_name.endswith("DoubleHeadsModel")
):
input_dict["labels"] = rand_int_tensor(
device, 0, vocab_size, (bs, seq_length)
)
elif model_name.endswith("TokenClassification"):
input_dict["labels"] = rand_int_tensor(
device, 0, model.config.num_labels - 1, (bs, seq_length)
)
elif model_name.endswith("MultipleChoice"):
input_dict["labels"] = rand_int_tensor(device, 0, num_choices, (bs,))
elif model_name.endswith("SequenceClassification"):
input_dict["labels"] = rand_int_tensor(
device, 0, model.config.num_labels - 1, (bs,)
)
elif model_name.endswith("NextSentencePrediction"):
input_dict["labels"] = rand_int_tensor(device, 0, 1, (bs,))
elif model_name.endswith("ForConditionalGeneration"):
input_dict["labels"] = rand_int_tensor(
device, 0, vocab_size - 1, (bs, seq_length)
)
elif model_name in EXTRA_MODELS:
input_dict["labels"] = rand_int_tensor(
device, 0, vocab_size, (bs, seq_length)
)
else:
raise NotImplementedError(
f"Class {model_name} unsupported for training test "
)
return input_dict
def rand_int_tensor(device, low, high, shape):
return torch.randint(
low,
high,
shape,
device=device,
dtype=torch.int64,
requires_grad=False,
)
EXTRA_MODELS = {
"AllenaiLongformerBase": (
AutoConfig.from_pretrained("allenai/longformer-base-4096"),
AutoModelForMaskedLM,
),
"Reformer": (
ReformerConfig(),
AutoModelForMaskedLM,
),
"T5Small": (
AutoConfig.from_pretrained("t5-small"),
AutoModelForSeq2SeqLM,
),
"BigBird": (
BigBirdConfig(attention_type="block_sparse"),
AutoModelForMaskedLM,
),
"DistillGPT2": (
AutoConfig.from_pretrained("distilgpt2"),
AutoModelForCausalLM,
),
"GoogleFnet": (
AutoConfig.from_pretrained("google/fnet-base"),
AutoModelForMaskedLM,
),
"YituTechConvBert": (
AutoConfig.from_pretrained("YituTech/conv-bert-base"),
AutoModelForMaskedLM,
),
"CamemBert": (
AutoConfig.from_pretrained("camembert-base"),
AutoModelForMaskedLM,
),
}
class HuggingfaceRunner(BenchmarkRunner):
def __init__(self):
super(HuggingfaceRunner, self).__init__()
self.suite_name = "huggingface"
def load_model(
self,
device,
model_name,
batch_size=None,
):
is_training = self.args.training
use_eval_mode = self.args.use_eval_mode
dtype = torch.float32
if model_name not in EXTRA_MODELS:
model_cls = get_module_cls_by_model_name(model_name)
config_cls = model_cls.config_class
config = config_cls()
# NB: some models need a pad token defined to handle BS > 1
if (
model_cls
in [
GPT2ForSequenceClassification,
GPTNeoForSequenceClassification,
GPTJForSequenceClassification,
]
or model_cls.__name__.startswith("Roberta")
or model_cls.__name__.startswith("Marian")
):
config.pad_token_id = 0
else:
config, model_cls = EXTRA_MODELS[model_name]
if "auto" in model_cls.__module__:
# Handle auto classes
model = model_cls.from_config(config).to(device, dtype=dtype)
else:
model = model_cls(config).to(device, dtype=dtype)
if model_name in BATCH_SIZE_KNOWN_MODELS:
batch_size_default = BATCH_SIZE_KNOWN_MODELS[model_name]
elif batch_size is None:
batch_size_default = 16
log.warning(
"Batch size not specified for {model_name}. Setting batch_size=16"
)
if batch_size is None:
batch_size = batch_size_default
if model_name in USE_SMALL_BATCH_SIZE:
batch_size = USE_SMALL_BATCH_SIZE[model_name]
log.warning(
f"Running smaller batch size={batch_size} for {model_name}, orig batch_size={batch_size_default}"
)
elif USE_HALF_BATCH_SIZE and batch_size >= 2:
batch_size = int(batch_size / 2)
log.warning(
f"Running smaller batch size={batch_size} for {model_name}, orig batch_size={batch_size_default}"
)
example_inputs = generate_inputs_for_model(
model_cls, model, model_name, batch_size, device, include_loss_args=True
)
# So we can check for correct gradients without eliminating the dropout computation
for attr in dir(config):
if "drop" in attr and isinstance(getattr(config, attr), float):
setattr(config, attr, 1e-30)
if is_training and not use_eval_mode:
model.train()
else:
model.eval()
self.init_optimizer(device, model.parameters())
self.validate_model(model, example_inputs)
return device, model_name, model, example_inputs, batch_size
def iter_model_names(self, args):
model_names = list(BATCH_SIZE_KNOWN_MODELS.keys()) + list(EXTRA_MODELS.keys())
model_names = set(model_names)
model_names = sorted(model_names)
start, end = self.get_benchmark_indices(len(model_names))
for index, model_name in enumerate(model_names):
if index < start or index >= end:
continue
if (
not re.search("|".join(args.filter), model_name, re.I)
or re.search("|".join(args.exclude), model_name, re.I)
or model_name in SKIP
):
continue
yield model_name
def pick_grad(self, name, is_training):
if is_training:
return torch.enable_grad()
else:
return torch.no_grad()
def get_tolerance_and_cosine_flag(self, is_training, current_device, name):
cosine = self.args.cosine
if is_training:
return 1e-2, cosine
return 1e-3, cosine
def compute_loss(self, pred):
return pred[0]
def forward_pass(self, mod, inputs, collect_outputs=True):
return mod(**inputs)
def forward_and_backward_pass(self, mod, inputs, collect_outputs=True):
cloned_inputs = clone_inputs(inputs)
mod.zero_grad(True)
with self.autocast():
pred = mod(**cloned_inputs)
loss = self.compute_loss(pred)
self.grad_scaler.scale(loss).backward()
self.optimizer_step()
if collect_outputs:
return collect_results(mod, pred, loss, cloned_inputs)
return None
def refresh_model_names_and_batch_sizes():
"""
This function reads the HF Fx tracer supported models and finds the largest
batch size that could fit on the GPU with PyTorch eager.
The resulting data is written in huggingface_models_list.txt.
Note - We only need to run this function if we believe that HF Fx tracer now
supports more models.
"""
import transformers.utils.fx as hf_fx
family = dict()
lm_seen = set()
family_seen = set()
for cls_name in hf_fx._SUPPORTED_MODELS:
if "For" not in cls_name:
continue
model_cls = get_module_cls_by_model_name(cls_name)
# TODO: AttributeError: '*Config' object has no attribute 'vocab_size'
if model_cls in [
CLIPModel,
CLIPVisionModel,
SwinForImageClassification,
SwinForImageClassification,
SwinForMaskedImageModeling,
SwinModel,
ViTForImageClassification,
ViTForMaskedImageModeling,
ViTModel,
]:
continue
# TODO: AssertionError: Padding_idx must be within num_embeddings
if model_cls in [MarianForCausalLM, MarianMTModel, MarianModel]:
continue
# TODO: "model is not supported yet" from HFTracer
if model_cls in [HubertForSequenceClassification]:
continue
# TODO: shape mismatch in loss calculation
if model_cls in [LxmertForQuestionAnswering]:
continue
family_name = cls_name.split("For")[0]
if family_name not in family:
family[family_name] = []
if cls_name.endswith(("MaskedLM", "CausalLM")) and family_name not in lm_seen:
family[family_name].append(cls_name)
lm_seen.add(family_name)
elif (
cls_name.endswith(
("SequenceClassification", "ConditionalGeneration", "QuestionAnswering")
)
and family_name not in family_seen
):
family[family_name].append(cls_name)
family_seen.add(family_name)
elif cls_name.endswith("ImageClassification"):
family[family_name].append(cls_name)
chosen_models = set()
for members in family.values():
chosen_models.update(set(members))
# Add the EXTRA_MODELS
chosen_models.update(set(EXTRA_MODELS.keys()))
for model_name in sorted(chosen_models):
try:
subprocess.check_call(
[sys.executable]
+ sys.argv
+ ["--find-batch-sizes"]
+ [f"--only={model_name}"]
+ [f"--output={MODELS_FILENAME}"]
)
except subprocess.SubprocessError:
log.warning(f"Failed to find suitable batch size for {model_name}")
if __name__ == "__main__":
# Code to refresh model names and batch sizes
# if "--find-batch-sizes" not in sys.argv:
# refresh_model_names_and_batch_sizes()
logging.basicConfig(level=logging.WARNING)
warnings.filterwarnings("ignore")
main(HuggingfaceRunner())

View File

@ -0,0 +1,53 @@
AlbertForMaskedLM,8
AlbertForQuestionAnswering,8
AllenaiLongformerBase,1
BartForCausalLM,16
BartForConditionalGeneration,4
BertForMaskedLM,128
BertForQuestionAnswering,128
BigBird,1
BlenderbotForCausalLM,32
BlenderbotForConditionalGeneration,32
BlenderbotSmallForCausalLM,128
BlenderbotSmallForConditionalGeneration,128
CamemBert,1
DebertaForMaskedLM,32
DebertaForQuestionAnswering,32
DebertaV2ForMaskedLM,8
DebertaV2ForQuestionAnswering,8
DistilBertForMaskedLM,64
DistilBertForQuestionAnswering,64
DistillGPT2,1
ElectraForCausalLM,64
ElectraForQuestionAnswering,128
GPT2ForSequenceClassification,8
GPTJForCausalLM,1
GPTJForQuestionAnswering,1
GPTNeoForCausalLM,8
GPTNeoForSequenceClassification,8
GoogleFnet,1
LayoutLMForMaskedLM,32
LayoutLMForSequenceClassification,32
M2M100ForConditionalGeneration,8
MBartForCausalLM,32
MBartForConditionalGeneration,16
MT5ForConditionalGeneration,8
MegatronBertForCausalLM,16
MegatronBertForQuestionAnswering,16
MobileBertForMaskedLM,32
MobileBertForQuestionAnswering,64
OPTForCausalLM,32
PLBartForCausalLM,32
PLBartForConditionalGeneration,16
PegasusForCausalLM,32
PegasusForConditionalGeneration,16
Reformer,1
RobertaForCausalLM,128
RobertaForQuestionAnswering,128
Speech2Text2ForCausalLM,128
T5ForConditionalGeneration,8
T5Small,1
TrOCRForCausalLM,32
XGLMForCausalLM,8
XLNetLMHeadModel,128
YituTechConvBert,1

View File

@ -0,0 +1,170 @@
import model
import torch
import torch._dynamo
import torch._inductor
import torch._inductor.config as config
import torch._inductor.triton_ops
import triton
# The flag below controls whether to allow TF32 on matmul. This flag defaults to True.
torch.backends.cuda.matmul.allow_tf32 = True
# The flag below controls whether to allow TF32 on cuDNN. This flag defaults to True.
torch.backends.cudnn.allow_tf32 = True
# config.debug = True
config.triton.convolution = "autotune"
# conv benchmarks
conv_confs = [
triton.testing.Benchmark(
x_names=["layout"],
x_vals=["nchw", "nhwc"],
line_arg="provider",
line_vals=["aten", "autotune", "triton_conv", "triton_conv1x1"],
line_names=["aten", "autotune", "triton_conv", "triton_conv1x1"],
ylabel="TFLOPS",
plot_name=f"resnet50-conv{i}-perf",
args={
"BATCH": BATCH,
"IN_H": IN_H,
"IN_W": IN_W,
"IN_C": IN_C,
"KERNEL_N": KERNEL_N,
"KERNEL_H": KERNEL_H,
"KERNEL_W": KERNEL_W,
"stride": stride,
"padding": padding,
},
)
for i, (
IN_H,
IN_W,
IN_C,
KERNEL_H,
KERNEL_W,
KERNEL_N,
stride,
padding,
) in enumerate(model.resnet50_layers)
for BATCH in [32]
]
@triton.testing.perf_report(conv_confs)
def bench_op(
# Tensor dimensions
BATCH,
IN_C,
IN_H,
IN_W,
KERNEL_N,
KERNEL_H,
KERNEL_W,
# provider
provider,
# parameters of conv
stride=(1, 1),
padding=(0, 0),
dilation=(1, 1),
groups=1,
dtype=torch.float32,
layout="nhwc",
warmup=25,
rep=75,
):
skip = False
# allocate inputs, nchw
x = torch.randn((BATCH, IN_C, IN_H, IN_W), dtype=dtype, device="cuda")
w = torch.randn(
(KERNEL_N, IN_C // groups, KERNEL_H, KERNEL_W), dtype=dtype, device="cuda"
)
bias = torch.randn((KERNEL_N), dtype=dtype, device="cuda")
if layout == "nhwc":
x = x.to(memory_format=torch.channels_last)
w = w.to(memory_format=torch.channels_last)
OUT_H = (
IN_H + 2 * padding[0] - dilation[0] * (KERNEL_H - 1) - 1 + stride[0]
) // stride[0]
OUT_W = (
IN_W + 2 * padding[1] - dilation[1] * (KERNEL_W - 1) - 1 + stride[1]
) // stride[1]
tflops = (
lambda ms: 2.0
* BATCH
* OUT_H
* OUT_W
* IN_C
* KERNEL_H
* KERNEL_W
* KERNEL_N
/ ms
* 1e-9
)
if provider == "aten":
def fn():
return torch.conv2d(x, w, bias, stride, padding, dilation, groups)
elif provider == "triton_conv":
def fn():
return torch._inductor.triton_ops.conv(
x, w, bias, stride, padding, dilation, False, (0, 0), groups
)
elif provider == "triton_conv1x1":
def fn():
return torch._inductor.triton_ops.conv1x1(
x, w, bias, stride, padding, dilation, False, (0, 0), groups
)
if KERNEL_H != 1 or KERNEL_W != 1:
skip = True
elif provider == "autotune":
@torch._dynamo.optimize("inductor")
def wrap_conv(*args, **kwargs):
return torch.conv2d(*args, **kwargs)
def fn():
return wrap_conv(x, w, bias, stride, padding, dilation, groups)
# use cuda graph for fair comparison
elif provider != "autotune" and not skip:
# prepare new tensor
new_x = x.clone()
new_w = w.clone()
new_bias = bias.clone()
# warmp up for cudagraph
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s):
for i in range(3):
fn()
torch.cuda.current_stream().wait_stream(s)
# capture
g = torch.cuda.CUDAGraph()
with torch.cuda.graph(g):
fn()
def fn():
x.copy_(new_x)
w.copy_(new_w)
bias.copy_(new_bias)
return g.replay()
if not skip:
ms, min_ms, max_ms = triton.testing.do_bench(fn, warmup=warmup, rep=rep)
return tflops(ms), tflops(max_ms), tflops(min_ms)
else:
return 0, 0, 0
bench_op.run(print_data=True)

View File

@ -0,0 +1,144 @@
import model
import torch
import torch._inductor.triton_ops
import triton
# The flag below controls whether to allow TF32 on matmul. This flag defaults to True.
torch.backends.cuda.matmul.allow_tf32 = True
# The flag below controls whether to allow TF32 on cuDNN. This flag defaults to True.
torch.backends.cudnn.allow_tf32 = True
# https://pytorch.org/blog/accelerating-pytorch-with-cuda-graphs/
useCudaGraph = False
# conv benchmarks
conv_confs = [
triton.testing.Benchmark(
x_names=["layout"],
x_vals=["nchw", "nhwc"],
line_arg="provider",
line_vals=["cublas", "triton"],
line_names=["cuBLAS", "Triton"],
ylabel="TFLOPS",
plot_name=f"resnet50-conv{i}-perf",
args={
"BATCH": BATCH,
"IN_H": IN_H,
"IN_W": IN_W,
"IN_C": IN_C,
"KERNEL_N": KERNEL_N,
"KERNEL_H": KERNEL_H,
"KERNEL_W": KERNEL_W,
"stride": stride,
"padding": padding,
},
)
for i, (
IN_H,
IN_W,
IN_C,
KERNEL_H,
KERNEL_W,
KERNEL_N,
stride,
padding,
) in enumerate(model.resnet50_layers)
for BATCH in [32]
]
@triton.testing.perf_report(conv_confs)
def bench_op(
# Tensor dimensions
BATCH,
IN_C,
IN_H,
IN_W,
KERNEL_N,
KERNEL_H,
KERNEL_W,
# provider
provider,
# parameters of conv
stride=(1, 1),
padding=(0, 0),
dilation=(1, 1),
groups=1,
dtype=torch.float32,
layout="nhwc",
warmup=25,
rep=75,
):
# allocate inputs, nchw
x = torch.randn((BATCH, IN_C, IN_H, IN_W), dtype=dtype, device="cuda")
w = torch.randn(
(KERNEL_N, IN_C // groups, KERNEL_H, KERNEL_W), dtype=dtype, device="cuda"
)
bias = torch.randn((KERNEL_N), dtype=dtype, device="cuda")
if layout == "nhwc":
x = x.to(memory_format=torch.channels_last)
w = w.to(memory_format=torch.channels_last)
OUT_H = (
IN_H + 2 * padding[0] - dilation[0] * (KERNEL_H - 1) - 1 + stride[0]
) // stride[0]
OUT_W = (
IN_W + 2 * padding[1] - dilation[1] * (KERNEL_W - 1) - 1 + stride[1]
) // stride[1]
tflops = (
lambda ms: 2.0
* BATCH
* OUT_H
* OUT_W
* IN_C
* KERNEL_H
* KERNEL_W
* KERNEL_N
/ ms
* 1e-9
)
if provider == "cublas":
def fn():
return torch.conv2d(x, w, bias, stride, padding, dilation, groups)
elif provider == "triton":
def fn():
return torch._inductor.triton_ops.conv(
x, w, bias, stride, padding, dilation, False, (0, 0), groups
)
# useCudaGraph won't change the TFLOPs,
# because do_bench() clear L2 cache to hide the latency of CPU launch time
if useCudaGraph:
new_x = x.clone()
new_w = w.clone()
new_bias = bias.clone()
# warmp up for cudagraph
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s):
for i in range(3):
fn()
torch.cuda.current_stream().wait_stream(s)
# capture
g = torch.cuda.CUDAGraph()
with torch.cuda.graph(g):
fn()
def fn():
x.copy_(new_x)
w.copy_(new_w)
bias.copy_(new_bias)
return g.replay()
ms, min_ms, max_ms = triton.testing.do_bench(fn, warmup=warmup, rep=rep)
return tflops(ms), tflops(max_ms), tflops(min_ms)
bench_op.run(print_data=True)

View File

@ -0,0 +1,140 @@
import model
import torch
import torch._inductor.triton_ops
import triton
# https://pytorch.org/blog/accelerating-pytorch-with-cuda-graphs/
useCudaGraph = False
# conv benchmarks
conv_confs = [
triton.testing.Benchmark(
x_names=["layout"],
x_vals=["nchw", "nhwc"],
line_arg="provider",
line_vals=["cublas", "triton"],
line_names=["cuBLAS", "Triton"],
ylabel="TFLOPS",
plot_name=f"resnet50-conv1x1-{i}-performance",
args={
"BATCH": BATCH,
"IN_H": IN_H,
"IN_W": IN_W,
"IN_C": IN_C,
"KERNEL_N": KERNEL_N,
"KERNEL_H": KERNEL_H,
"KERNEL_W": KERNEL_W,
"stride": stride,
"padding": padding,
},
)
for i, (
IN_H,
IN_W,
IN_C,
KERNEL_H,
KERNEL_W,
KERNEL_N,
stride,
padding,
) in enumerate(model.resnet50_layers)
if KERNEL_H == 1 and KERNEL_W == 1
for BATCH in [32]
]
@triton.testing.perf_report(conv_confs)
def bench_op(
# Tensor dimensions
BATCH,
IN_C,
IN_H,
IN_W,
KERNEL_N,
KERNEL_H,
KERNEL_W,
# provider
provider,
# parameters of conv
stride=(1, 1),
padding=(0, 0),
dilation=(1, 1),
groups=1,
dtype=torch.float32,
layout="nhwc",
warmup=25,
rep=75,
):
# allocate inputs, nchw
x = torch.randn((BATCH, IN_C, IN_H, IN_W), dtype=dtype, device="cuda")
w = torch.randn(
(KERNEL_N, IN_C // groups, KERNEL_H, KERNEL_W), dtype=dtype, device="cuda"
)
bias = torch.randn((KERNEL_N), dtype=dtype, device="cuda")
if layout == "nhwc":
x = x.to(memory_format=torch.channels_last)
w = w.to(memory_format=torch.channels_last)
OUT_H = (
IN_H + 2 * padding[0] - dilation[0] * (KERNEL_H - 1) - 1 + stride[0]
) // stride[0]
OUT_W = (
IN_W + 2 * padding[1] - dilation[1] * (KERNEL_W - 1) - 1 + stride[1]
) // stride[1]
tflops = (
lambda ms: 2.0
* BATCH
* OUT_H
* OUT_W
* IN_C
* KERNEL_H
* KERNEL_W
* KERNEL_N
/ ms
* 1e-9
)
if provider == "cublas":
def fn():
return torch.conv2d(x, w, bias, stride, padding, dilation, groups)
elif provider == "triton":
def fn():
return torch._inductor.triton_ops.conv1x1(
x, w, bias, stride, padding, dilation, False, (0, 0), groups
)
if useCudaGraph:
# prepare new data
new_x = x.clone()
new_w = w.clone()
new_bias = bias.clone()
# warmp up for cudagraph
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s):
for i in range(3):
fn()
torch.cuda.current_stream().wait_stream(s)
# capture
g = torch.cuda.CUDAGraph()
with torch.cuda.graph(g):
fn()
def fn():
x.copy_(new_x)
w.copy_(new_w)
bias.copy_(new_bias)
return g.replay()
ms, min_ms, max_ms = triton.testing.do_bench(fn, warmup=warmup, rep=rep)
return tflops(ms), tflops(max_ms), tflops(min_ms)
bench_op.run(print_data=True)

View File

@ -0,0 +1,298 @@
# flake8: noqa
import model
import torch
import torch._dynamo
import torch._inductor.config
import triton
from prettytable import PrettyTable
# torch._inductor.config.debug = True
torch._inductor.config.triton.convolution = "triton"
torch._inductor.config.triton.dense_indexing = True
torch.manual_seed(0)
useCudaGraph = True
class Func(object):
# conv
@torch._dynamo.optimize("inductor")
def conv_torchinductor(x, w, bias, stride, padding, dilation, groups):
y = torch.conv2d(x, w, None, stride, padding, dilation, groups)
return y
# conv
def conv(x, w, bias, stride, padding, dilation, groups):
y = torch.conv2d(x, w, None, stride, padding, dilation, groups)
return y
# conv+bias
@torch._dynamo.optimize("inductor")
def conv_add_torchinductor(x, w, bias, stride, padding, dilation, groups):
y = torch.conv2d(x, w, bias, stride, padding, dilation, groups)
return y
# conv+bias
def conv_add(x, w, bias, stride, padding, dilation, groups):
y = torch.conv2d(x, w, bias, stride, padding, dilation, groups)
return y
# relu(conv)
@torch._dynamo.optimize("inductor")
def conv_relu_torchinductor(x, w, bias, stride, padding, dilation, groups):
y = torch.conv2d(x, w, None, stride, padding, dilation, groups)
return torch.relu(y)
# relu(conv)
def conv_relu(x, w, bias, stride, padding, dilation, groups):
y = torch.conv2d(x, w, None, stride, padding, dilation, groups)
return torch.relu(y)
# relu(conv+bias)
@torch._dynamo.optimize("inductor")
def conv_add_relu_torchinductor(x, w, bias, stride, padding, dilation, groups):
y = torch.conv2d(x, w, bias, stride, padding, dilation, groups)
return torch.relu(y)
# relu(conv+bias)
def conv_add_relu(x, w, bias, stride, padding, dilation, groups):
y = torch.conv2d(x, w, bias, stride, padding, dilation, groups)
return torch.relu(y)
# bn(conv)
@torch._dynamo.optimize("inductor")
def conv_bn_torchinductor(
x,
w,
bias,
stride,
padding,
dilation,
groups,
running_mean,
running_var,
bn_weight,
bn_bias,
):
y = torch.conv2d(x, w, None, stride, padding, dilation, groups)
y = torch.batch_norm(
y,
weight=bn_weight,
bias=bn_bias,
running_mean=running_mean,
running_var=running_var,
training=False,
momentum=1,
eps=1e-5,
cudnn_enabled=True,
)
return y
# bn(conv)
def conv_bn(
x,
w,
bias,
stride,
padding,
dilation,
groups,
running_mean,
running_var,
bn_weight,
bn_bias,
):
y = torch.conv2d(x, w, None, stride, padding, dilation, groups)
y = torch.batch_norm(
y,
weight=bn_weight,
bias=bn_bias,
running_mean=running_mean,
running_var=running_var,
training=False,
momentum=1,
eps=1e-5,
cudnn_enabled=True,
)
return y
# relu(bn(conv))
@torch._dynamo.optimize("inductor")
def conv_bn_relu_torchinductor(
x,
w,
bias,
stride,
padding,
dilation,
groups,
running_mean,
running_var,
bn_weight,
bn_bias,
):
y = torch.conv2d(x, w, None, stride, padding, dilation, groups)
y = torch.batch_norm(
y,
weight=bn_weight,
bias=bn_bias,
running_mean=running_mean,
running_var=running_var,
training=False,
momentum=1,
eps=1e-5,
cudnn_enabled=True,
)
return torch.relu(y)
# relu(bn(conv))
def conv_bn_relu(
x,
w,
bias,
stride,
padding,
dilation,
groups,
running_mean,
running_var,
bn_weight,
bn_bias,
):
y = torch.conv2d(x, w, None, stride, padding, dilation, groups)
y = torch.batch_norm(
y,
weight=bn_weight,
bias=bn_bias,
running_mean=running_mean,
running_var=running_var,
training=False,
momentum=1,
eps=1e-5,
cudnn_enabled=True,
)
return torch.relu(y)
def cuda_graph(fn, x, w, bias):
new_x = x.clone()
new_w = w.clone()
if bias is not None:
new_bias = bias.clone()
# warmp up for cudagraph
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s):
for i in range(3):
fn()
torch.cuda.current_stream().wait_stream(s)
# capture
g = torch.cuda.CUDAGraph()
with torch.cuda.graph(g):
fn()
def fn():
x.copy_(new_x)
w.copy_(new_w)
if bias is not None:
bias.copy_(new_bias)
return g.replay()
return fn
def bench(layer_params, layer_id, p, fusion_types=[""]):
BATCH = 32
IN_H, IN_W, IN_C, KERNEL_H, KERNEL_W, KERNEL_N, stride, padding = layer_params
dilation, groups = (1, 1), 1
dtype = torch.float32
OUT_H = (
IN_H + 2 * padding[0] - dilation[0] * (KERNEL_H - 1) - 1 + stride[0]
) // stride[0]
OUT_W = (
IN_W + 2 * padding[1] - dilation[1] * (KERNEL_W - 1) - 1 + stride[1]
) // stride[1]
tflops = (
lambda ms: 2.0
* BATCH
* OUT_H
* OUT_W
* IN_C
* KERNEL_H
* KERNEL_W
* KERNEL_N
/ ms
* 1e-9
)
# allocate inputs, nchw
x = torch.randn((BATCH, IN_C, IN_H, IN_W), dtype=dtype, device="cuda")
w = torch.randn(
(KERNEL_N, IN_C // groups, KERNEL_H, KERNEL_W), dtype=dtype, device="cuda"
)
row = [layer_id]
for fusion_type in fusion_types:
if fusion_type == "":
conv_torchinductor = getattr(Func, "conv_torchinductor")
conv = getattr(Func, "conv")
else:
conv_torchinductor = getattr(Func, f"conv_{fusion_type}_torchinductor")
conv = getattr(Func, f"conv_{fusion_type}")
if "add" in fusion_type:
bias = torch.randn((KERNEL_N,), dtype=dtype, device="cuda")
else:
bias = None
args = (x, w, bias, stride, padding, dilation, groups)
if "bn" in fusion_type:
running_mean = torch.randn((KERNEL_N), dtype=dtype, device="cuda")
running_var = torch.randn((KERNEL_N), dtype=dtype, device="cuda")
bn_weight = torch.randn((KERNEL_N), dtype=dtype, device="cuda")
bn_bias = torch.randn((KERNEL_N), dtype=dtype, device="cuda")
args += (
running_mean,
running_var,
bn_weight,
bn_bias,
)
def fn_conv():
return conv(*args)
def fn_conv_torchinductor():
return conv_torchinductor(*args)
if useCudaGraph:
fn_conv = cuda_graph(fn_conv, x, w, bias)
torch_conv_ms, _, _ = triton.testing.do_bench(fn_conv)
triton_conv_ms, _, _ = triton.testing.do_bench(fn_conv_torchinductor)
row.extend([tflops(torch_conv_ms), tflops(triton_conv_ms)])
p.add_row(row)
fusion_types = ["", "add", "relu", "add_relu", "bn", "bn_relu"]
p = PrettyTable()
field_names = ["layer"]
for fusion_type in fusion_types:
if fusion_type == "":
field_names.append("torch conv")
field_names.append("triton conv")
else:
field_names.append(f"torch conv+{fusion_type}")
field_names.append(f"triton conv+{fusion_type}")
p.field_names = field_names
p.float_format = ".3"
for id, layer in enumerate(model.resnet50_layers):
bench(layer, id, p, fusion_types)
print(p)

View File

@ -0,0 +1,121 @@
# flake8: noqa
import torch
import torch._dynamo
import torch._inductor.config
import triton
from prettytable import PrettyTable
# torch._inductor.config.debug = True
torch._inductor.config.triton.dense_indexing = True
torch.manual_seed(0)
# The flag below controls whether to allow TF32 on matmul.
torch.backends.cuda.matmul.allow_tf32 = True
class Func(object):
# mm
@torch._dynamo.optimize("inductor")
def mm(a, b, bias):
y = torch.mm(a, b)
return y
# mm+bias
@torch._dynamo.optimize("inductor")
def mm_add(a, b, bias):
y = torch.mm(a, b)
return y + bias
# relu(mm)
@torch._dynamo.optimize("inductor")
def mm_relu(a, b, bias):
y = torch.mm(a, b)
return torch.relu(y)
# relu(mm+bias)
@torch._dynamo.optimize("inductor")
def mm_add_relu(a, b, bias):
y = torch.mm(a, b)
y += bias
return torch.relu(y)
def bench(shape, layer_id, p, fusion_types=[""]):
dtype = torch.float16
M, K = shape[0]
_, N = shape[1]
torch.manual_seed(0)
# allocate inputs
a = torch.randn(shape[0], device="cuda", dtype=dtype)
b = torch.randn(shape[1], device="cuda", dtype=dtype)
def tflops(ms):
return M * K * N / ms * 1e-9
row = [layer_id]
for fusion_type in fusion_types:
if fusion_type == "":
fn_mm = getattr(Func, "mm")
else:
fn_mm = getattr(Func, f"mm_{fusion_type}")
if "add" in fusion_type:
bias = torch.randn((M, N), dtype=dtype, device="cuda")
else:
bias = None
args = (a, b, bias)
def fn():
return fn_mm(*args)
torch._inductor.config.triton.mm = "aten"
torch_mm_ms, _, _ = triton.testing.do_bench(fn)
torch._inductor.config.triton.mm = "triton"
# reset to force code gen new python code
torch._dynamo.reset()
torch._inductor.metrics.reset()
triton_mm_ms, _, _ = triton.testing.do_bench(fn)
assert (
torch._inductor.metrics.generated_kernel_count == 1
), "codegen #kernel != 1"
row.extend([tflops(torch_mm_ms), tflops(triton_mm_ms)])
p.add_row(row)
fusion_types = ["", "add", "relu", "add_relu"]
shapes = [
# alexnet
([128, 9216], [9216, 4096]),
([128, 4096], [4096, 4096]),
([128, 4096], [4096, 1000]),
# BERT
([2048, 768], [768, 768]),
([2048, 768], [768, 3072]),
([2048, 3072], [3072, 768]),
# hf_GPT2
([1024, 768], [768, 768]),
([1024, 768], [768, 3072]),
([1024, 3072], [3072, 768]),
([1024, 768], [768, 2304]),
]
p = PrettyTable()
field_names = ["layer"]
for fusion_type in fusion_types:
if fusion_type == "":
field_names.append("torch mm")
field_names.append("triton mm")
else:
field_names.append(f"torch mm+{fusion_type}")
field_names.append(f"triton mm+{fusion_type}")
p.field_names = field_names
p.float_format = ".3"
for id, shape in enumerate(shapes):
bench(shape, id, p, fusion_types)
print(p)

View File

@ -0,0 +1,13 @@
from torch.utils.benchmark import Timer
def time_with_torch_timer(fn, args, kwargs=None, iters=100):
kwargs = kwargs or {}
env = {"args": args, "kwargs": kwargs, "fn": fn}
fn_call = "fn(*args, **kwargs)"
# Measure end-to-end time
timer = Timer(stmt=f"{fn_call}", globals=env)
tt = timer.timeit(iters)
return tt

View File

@ -0,0 +1,61 @@
import torch
import torch._dynamo
import torch._dynamo.config
import torch._inductor.config as config
from benchmark_helper import time_with_torch_timer
@torch._dynamo.optimize("inductor", nopython=True)
def inductor_aten_bmm(a, b):
return torch.bmm(a, b)
@torch._dynamo.optimize("inductor", nopython=True)
def inductor_triton_bmm(a, b):
return torch.bmm(a, b)
def torch_bmm(a, b):
return torch.bmm(a, b)
def test_total_time(shapes):
print("shape; torch bmm; inductor aten bmm; inductor triton bmm")
for i in range(len(shapes)):
a_shape, b_shape = shapes[i]
print(a_shape, "x", b_shape, end="; ")
a = torch.randn(a_shape, device="cuda", dtype=torch.float16)
b = torch.randn(b_shape, device="cuda", dtype=a.dtype)
config.triton.use_bmm = False
inductor_aten_bmm(a, b)
config.triton.use_bmm = True
inductor_triton_bmm(a, b)
torch_ms = time_with_torch_timer(torch_bmm, (a, b)).mean * 1000
config.triton.use_bmm = False
ind_aten_ms = time_with_torch_timer(inductor_aten_bmm, (a, b)).mean * 1000
config.triton.use_bmm = True
ind_triton_ms = time_with_torch_timer(inductor_triton_bmm, (a, b)).mean * 1000
print(torch_ms, ind_aten_ms, ind_triton_ms, sep="; ")
if __name__ == "__main__":
shapes = [
# BERT (all)
([192, 128, 64], [192, 64, 128]),
([192, 128, 128], [192, 128, 64]),
# hf_GPT2 (all)
([12, 1024, 1024], [12, 1024, 64]),
([12, 1024, 64], [12, 64, 1024]),
# hf_Albert (all)
([12, 512, 64], [12, 64, 512]),
([12, 512, 512], [12, 512, 64]),
]
test_total_time(shapes)

View File

@ -0,0 +1,134 @@
import torch
import torch._dynamo
import torch._dynamo.config
import torch._inductor.config as config
import triton
from benchmark_helper import time_with_torch_timer
# The flag below controls whether to allow TF32 on matmul. This flag defaults to True.
torch.backends.cuda.matmul.allow_tf32 = True
# The flag below controls whether to allow TF32 on cuDNN. This flag defaults to True.
torch.backends.cudnn.allow_tf32 = True
@torch._dynamo.optimize("inductor", nopython=True)
def inductor_aten_mm(a, b):
return torch.mm(a, b)
@torch._dynamo.optimize("inductor", nopython=True)
def inductor_triton_mm(a, b):
return torch.mm(a, b)
def torch_mm(a, b):
return torch.mm(a, b)
def triton_mm(a, b):
return triton.ops.matmul(a, b)
def test_total_time(shapes):
print("shape; torch mm; triton mm; inductor aten mm; inductor triton mm")
for i in range(len(shapes)):
a_shape, b_shape = shapes[i]
print(a_shape, "x", b_shape, end="; ")
a = torch.randn(a_shape, device="cuda", dtype=torch.float16)
b = torch.randn(b_shape, device="cuda", dtype=a.dtype)
config.triton.mm = "aten"
inductor_aten_mm(a, b)
config.triton.mm = "triton"
inductor_triton_mm(a, b)
torch_ms = time_with_torch_timer(torch_mm, (a, b)).mean * 1000
triton_ms = time_with_torch_timer(triton_mm, (a, b)).mean * 1000
config.triton.mm = "aten"
ind_aten_ms = time_with_torch_timer(inductor_aten_mm, (a, b)).mean * 1000
config.triton.mm = "triton"
ind_triton_ms = time_with_torch_timer(inductor_triton_mm, (a, b)).mean * 1000
print(torch_ms, triton_ms, ind_aten_ms, ind_triton_ms, sep="; ")
torch._dynamo.reset()
def test_GPU_time(shapes):
print("shape; torch mm; triton mm; inductor aten mm; inductor triton mm")
for i in range(len(shapes)):
a_shape, b_shape = shapes[i]
print(a_shape, "x", b_shape, end="; ")
a = torch.randn(a_shape, device="cuda", dtype=torch.float16)
b = torch.randn(b_shape, device="cuda", dtype=a.dtype)
config.triton.mm = "aten"
inductor_aten_mm(a, b)
config.triton.mm = "triton"
inductor_triton_mm(a, b)
torch_ms, _, _ = triton.testing.do_bench(lambda: torch_mm(a, b))
triton_ms, _, _ = triton.testing.do_bench(lambda: triton_mm(a, b))
ind_aten_ms, _, _ = triton.testing.do_bench(lambda: inductor_aten_mm(a, b))
ind_triton_ms, _, _ = triton.testing.do_bench(lambda: inductor_triton_mm(a, b))
print(torch_ms, triton_ms, ind_aten_ms, ind_triton_ms, sep="; ")
torch._dynamo.reset()
if __name__ == "__main__":
shapes = [
# alexnet
([128, 9216], [9216, 4096]),
([128, 4096], [4096, 4096]),
([128, 4096], [4096, 1000]),
# BERT
([2048, 768], [768, 768]),
([2048, 768], [768, 3072]),
([2048, 3072], [3072, 768]),
# hf_GPT2
([1024, 768], [768, 768]),
([1024, 768], [768, 3072]),
([1024, 3072], [3072, 768]),
([1024, 768], [768, 2304]),
]
print("test total time")
test_total_time(shapes)
print("test GPU time")
test_GPU_time(shapes)
# Results Preview on AWS AI cluster
"""
test total time
shape; torch mm; triton mm; inductor aten mm; inductor triton mm
[128, 9216] x [9216, 4096]; 0.07240759208798409; 0.10885953903198242; 0.20063146017491817; 0.20054904278367758
[128, 4096] x [4096, 4096]; 0.03640300128608942; 0.10960095096379519; 0.09948539081960917; 0.0996188772842288
[128, 4096] x [4096, 1000]; 0.02215010579675436; 0.12592008337378502; 0.031120930798351765; 0.0370654184371233
[2048, 768] x [768, 768]; 0.023501068353652954; 0.10804693214595318; 0.03004650119692087; 0.0276932492852211
[2048, 768] x [768, 3072]; 0.045639658346772194; 0.10883208829909563; 0.062736920081079; 0.06480381824076176
[2048, 3072] x [3072, 768]; 0.054093082435429096; 0.10804777964949608; 0.08744294755160809; 0.07766005117446184
[1024, 768] x [768, 768]; 0.021525858901441097; 0.10909941978752613; 0.02656651195138693; 0.02683836966753006
[1024, 768] x [768, 3072]; 0.027319076471030712; 0.10825308971107006; 0.040118801407516; 0.039282338693737984
[1024, 3072] x [3072, 768]; 0.034132059663534164; 0.10594133753329515; 0.05069758277386427; 0.04572632722556591
[1024, 768] x [768, 2304]; 0.02529360819607973; 0.10486091021448374; 0.03724239766597748; 0.036449190229177475
test GPU time
shape; torch mm; triton mm; inductor aten mm; inductor triton mm
[128, 9216] x [9216, 4096]; 0.09113600105047226; 0.09011200070381165; 0.21606400609016418; 0.21606400609016418
[128, 4096] x [4096, 4096]; 0.053247999399900436; 0.05222399905323982; 0.1157120019197464; 0.1157120019197464
[128, 4096] x [4096, 1000]; 0.026623999699950218; 0.02969600073993206; 0.04710400104522705; 0.05222399905323982
[2048, 768] x [768, 768]; 0.02457600086927414; 0.020479999482631683; 0.04095999896526337; 0.03993599861860275
[2048, 768] x [768, 3072]; 0.05119999870657921; 0.05222399905323982; 0.07475200295448303; 0.07577600330114365
[2048, 3072] x [3072, 768]; 0.05939200147986412; 0.05222399905323982; 0.09830400347709656; 0.0870399996638298
[1024, 768] x [768, 768]; 0.01945599913597107; 0.016383999958634377; 0.03276799991726875; 0.03276799991726875
[1024, 768] x [768, 3072]; 0.03174399957060814; 0.03276799991726875; 0.053247999399900436; 0.053247999399900436
[1024, 3072] x [3072, 768]; 0.04403200000524521; 0.03379200026392937; 0.06860800087451935; 0.062463998794555664
[1024, 768] x [768, 2304]; 0.02969600073993206; 0.02969600073993206; 0.04915200173854828; 0.048128001391887665
"""

View File

@ -0,0 +1,100 @@
import torch
import torch._dynamo
import torch._inductor.config as inductor_config
from benchmark_helper import time_with_torch_timer
inductor_config.triton.mm = "triton"
@torch._dynamo.optimize("inductor", nopython=True)
def inductor_mm(a, b):
return torch.mm(a, b)
def torch_mm_relu(a, b):
return torch.nn.functional.relu(torch.mm(a, b))
def torch_mm(a, b):
return torch.mm(a, b)
if __name__ == "__main__":
# Real shapes from torchbench
a_shapes = [
[2048, 768],
[64, 1280],
[2048, 768],
[32, 2048],
[1, 39200],
[128, 3072],
[16, 1280],
]
b_shapes = [
[768, 3072],
[1280, 1000],
[768, 768],
[2048, 1000],
[39200, 50],
[3072, 1000],
[1280, 1000],
]
# Artificial larger shapes
a_shapes += [[10240, 512], [10240, 1024]]
b_shapes += [[512, 10240], [1024, 10240]]
for i in range(len(a_shapes)):
a_shape = a_shapes[i]
b_shape = b_shapes[i]
print("Shape:", a_shape, "x", b_shape)
a = torch.randn(a_shape, device="cuda", dtype=torch.float16)
b = torch.randn(b_shape, device="cuda", dtype=a.dtype)
time_with_torch_timer(torch_mm, (a, b), string_id="torch mm")
time_with_torch_timer(torch_mm_relu, (a, b), string_id="torch mm + relu")
time_with_torch_timer(inductor_mm, (a, b), string_id="inductor mm")
# Results obtained on the AWS AI cluster
# CPU: Intel(R) Xeon(R) Platinum 8275CL CPU @ 3.00GHz
# GPU: NVIDIA A100-SXM 40GB memory
"""
Shape: [2048, 768] x [768, 3072]
torch mm mean: 0.0592 ms
torch mm + relu mean: 0.0759 ms
inductor mm mean: 0.0653 ms
Shape: [64, 1280] x [1280, 1000]
torch mm mean: 0.0231 ms
torch mm + relu mean: 0.0316 ms
inductor mm mean: 0.0252 ms
Shape: [2048, 768] x [768, 768]
torch mm mean: 0.0190 ms
torch mm + relu mean: 0.0277 ms
inductor mm mean: 0.0274 ms
Shape: [32, 2048] x [2048, 1000]
torch mm mean: 0.0188 ms
torch mm + relu mean: 0.0290 ms
inductor mm mean: 0.0244 ms
Shape: [1, 39200] x [39200, 50]
torch mm mean: 0.0134 ms
torch mm + relu mean: 0.0234 ms
inductor mm mean: 0.0290 ms
Shape: [128, 3072] x [3072, 1000]
torch mm mean: 0.0181 ms
torch mm + relu mean: 0.0322 ms
inductor mm mean: 0.0319 ms
Shape: [16, 1280] x [1280, 1000]
torch mm mean: 0.0188 ms
torch mm + relu mean: 0.0289 ms
inductor mm mean: 0.0255 ms
Shape: [10240, 512] x [512, 10240]
torch mm mean: 0.4589 ms
torch mm + relu mean: 0.7896 ms
inductor mm mean: 0.5090 ms
Shape: [10240, 1024] x [1024, 10240]
torch mm mean: 0.9152 ms
torch mm + relu mean: 1.2124 ms
inductor mm mean: 0.9462 ms
"""

View File

@ -0,0 +1,176 @@
#!/usr/bin/env python3
import argparse
import inspect
import sys
import numpy as np
import tabulate
import torch
import torch._inductor
from torch._dynamo.optimizations.backends import cudagraphs_inner
from torch._dynamo.testing import same
from torch._inductor.compile_fx import compile_fx
from torch._inductor.utils import timed
try:
import test.test_torchinductor as tti
except ImportError:
tti = None
def compute_speedups(args, models, example_inputs):
expected = models[0](*example_inputs)
for model in models[1:]:
actual = model(*example_inputs)
assert same(actual, expected), expected[0] - actual[0]
timings = np.zeros((args.repeat, len(models)), np.float64)
for rep in range(args.repeat):
# interleave the runs to handle frequency scaling and load changes
for m, model in enumerate(models):
timings[rep, m] = timed(model, example_inputs)
median = np.median(timings, axis=0)
return (median[0] / median[1:]).tolist()
def microbenchmark(args, model, example_inputs):
compiled_fn = compile_fx(torch.fx.symbolic_trace(model), example_inputs)
cudagraphs_eager = cudagraphs_inner(model, example_inputs, copy_outputs=False)
cudagraphs_jit = cudagraphs_inner(
torch.jit.trace(model, example_inputs), example_inputs, copy_outputs=False
)
return compute_speedups(
args,
[cudagraphs_eager, cudagraphs_jit, compiled_fn],
example_inputs,
)
class MyModel1(torch.nn.Module):
def __init__(self):
super().__init__()
self.model = torch.nn.Sequential(
torch.nn.Linear(1024, 1024),
torch.nn.ReLU(),
)
def forward(self, input):
# return (self.model(input) + 1,)
return (self.model(input),)
class MyModel2(torch.nn.Module):
def forward(self, x, y):
# return x / (torch.abs(x) + 1.0),
return (x + y,)
class MicroBenchmarks:
@staticmethod
def add(a, b):
return (a + b,)
@staticmethod
def scale(x, m, d):
return ((x - m) / torch.clip(d, 1e-4),)
@staticmethod
def abs_norm(x):
return (x / (torch.abs(x) + 1),)
@staticmethod
def add_relu_softmax(x, a):
return (torch.softmax(torch.relu(x + a), -1),)
@staticmethod
def sum(a, b):
return ((a + b).sum(),)
def main():
parser = argparse.ArgumentParser()
parser.add_argument(
"--filter", "-k", action="append", help="filter benchmarks with regexp"
)
parser.add_argument(
"--exclude", "-x", action="append", help="filter benchmarks with regexp"
)
parser.add_argument("--devices", "-d", action="append", help="cpu or cuda")
parser.add_argument("--size", "-s", action="append", help="cpu or cuda")
parser.add_argument(
"--repeat", "-n", type=int, default=30, help="number of timing runs"
)
parser.add_argument(
"--threads", "-t", type=int, help="number of threads to use for eager"
)
parser.add_argument(
"--verbose", "-v", action="store_true", help="enable verbose debug printouts"
)
parser.add_argument(
"--nvfuser", action="store_true", help="enable nvfuser globally"
)
parser.add_argument("--transpose", action="store_true", help="transpose one input")
parser.add_argument("--broadcast", action="store_true", help="broadcast one input")
args = parser.parse_args()
# defaults
args.devices = args.devices or ["cpu", "cuda"]
args.filter = args.filter or [r"."]
args.exclude = args.exclude or [r"^$"]
args.size = args.size or [64, 256, 1024, 4096, 8192]
if args.nvfuser:
torch._C._jit_override_can_fuse_on_cpu(False)
torch._C._jit_override_can_fuse_on_gpu(False)
torch._C._jit_set_texpr_fuser_enabled(False)
torch._C._jit_set_nvfuser_enabled(True)
else:
torch._C._jit_override_can_fuse_on_cpu(torch._C._llvm_enabled())
torch._C._jit_override_can_fuse_on_gpu(True)
torch._C._jit_set_texpr_fuser_enabled(True)
if torch.cuda.is_available():
torch._C._jit_set_nvfuser_enabled(False)
if args.threads:
torch.set_num_threads(args.threads)
torch._inductor.config.cpp.threads = args.threads
if args.verbose:
torch._inductor.config.debug = True
torch._inductor.config.triton.autotune = True
rows = []
for model in (MicroBenchmarks.sum,):
nargs = len(inspect.signature(model).parameters)
for device in args.devices:
for n in args.size:
n = int(n)
sys.stdout.write(f"{model.__name__:10} {device:4} {n:5} ")
sys.stdout.flush()
inputs = [torch.rand((n, n), device=device) for _ in range(nargs)]
if args.broadcast:
inputs[-1] = torch.rand((1, n), device=device)
if args.transpose:
inputs[-1] = inputs[-1].transpose(0, 1)
result = microbenchmark(args, model, inputs)
rows.append([model.__name__, device, str(n)] + result)
print(" ".join(f"{v:.2f}x" for v in result))
print(
tabulate.tabulate(
rows,
headers=[
"model",
"dev",
"n",
"ts",
"inductor",
],
)
)
if __name__ == "__main__":
main()

View File

@ -0,0 +1,26 @@
# resnet50 layer shape
resnet50_layers = (
# IN_H, IN_W, IN_C, KERNEL_H, KERNEL_W, KERNEL_N, stride, padding
(224, 224, 3, 7, 7, 64, (2, 2), (0, 0)),
# conv2_x
(56, 56, 64, 1, 1, 64, (1, 1), (0, 0)),
(56, 56, 64, 3, 3, 64, (1, 1), (0, 0)),
(56, 56, 64, 1, 1, 256, (1, 1), (0, 0)),
# conv3_x
(56, 56, 256, 1, 1, 128, (2, 2), (0, 0)),
(28, 28, 128, 3, 3, 128, (1, 1), (0, 0)),
(28, 28, 128, 1, 1, 512, (1, 1), (0, 0)),
# conv4_x
(28, 28, 512, 1, 1, 256, (2, 2), (0, 0)),
(14, 14, 256, 3, 3, 256, (1, 1), (0, 0)),
(14, 14, 256, 1, 1, 1024, (1, 1), (0, 0)),
# conv5_x
(14, 14, 1024, 1, 1, 512, (2, 2), (0, 0)),
(7, 7, 512, 3, 3, 512, (1, 1), (0, 0)),
(7, 7, 512, 1, 1, 2048, (1, 1), (0, 0)),
)
alexnet_layers = (
# IN_H, IN_W, IN_C, KERNEL_H, KERNEL_W, KERNEL_N, stride, padding
(224, 224, 3, 11, 11, 64, (4, 4), (2, 2)),
)

View File

@ -0,0 +1,115 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([1024, 30000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([1024, 30000], f16), T([1024, 30000], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 12, ((T([2, 64, 512, 512], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 12, ((T([2, 64, 512, 512], f16), T([2, 64, 512, 512], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([2, 1, 1, 512], f32),), {'dtype': f16})
Operator: aten._unsafe_view.default
cnt: 36, ((T([2, 64, 512, 64], f16), [128, 512, 64]), {})
cnt: 12, ((T([2, 64, 64, 512], f16), [128, 64, 512]), {})
cnt: 12, ((T([128, 512, 512], f16), [2, 64, 512, 512]), {})
cnt: 12, ((T([128, 512, 64], f16), [2, 64, 512, 64]), {})
cnt: 36, ((T([2, 512, 64, 64], f16), [2, 512, 4096]), {})
cnt: 12, ((T([2, 512, 4096], f16), [1024, 4096]), {})
Operator: aten.add.Tensor
cnt: 4, ((T([2, 512, 128], f16), T([2, 512, 128], f16)), {})
cnt: 12, ((T([2, 64, 512, 512], f16), T([2, 1, 1, 512], f16)), {})
cnt: 72, ((T([2, 512, 4096], f16), T([2, 512, 4096], f16)), {})
cnt: 36, ((T([2, 512, 16384], f16), T([2, 512, 16384], f16)), {})
cnt: 12, ((T([2, 512, 16384], f16), 1.0), {})
cnt: 1, ((T([2, 512, 128], f16), 1.0), {})
cnt: 99, ((T([4096], f16), T([4096], f16)), {})
cnt: 11, ((T([4096, 16384], f16), T([4096, 16384], f16)), {})
cnt: 11, ((T([16384], f16), T([16384], f16)), {})
cnt: 11, ((T([16384, 4096], f16), T([16384, 4096], f16)), {})
cnt: 44, ((T([4096, 4096], f16), T([4096, 4096], f16)), {})
cnt: 1, ((T([30000, 128], f16), T([30000, 128], f16)), {})
Operator: aten.add_.Tensor
cnt: 1, ((T([2, 512, 128], f16), T([1, 512, 128], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([4096], f16), T([1024, 128], f16), T([128, 4096], f16, stride=(1, 128))), {})
cnt: 48, ((T([4096], f16), T([1024, 4096], f16), T([4096, 4096], f16, stride=(1, 4096))), {})
cnt: 12, ((T([16384], f16), T([1024, 4096], f16), T([4096, 16384], f16, stride=(1, 4096))), {})
cnt: 12, ((T([4096], f16), T([1024, 16384], f16), T([16384, 4096], f16, stride=(1, 16384))), {})
cnt: 1, ((T([128], f16), T([1024, 4096], f16), T([4096, 128], f16, stride=(1, 4096))), {})
cnt: 1, ((T([30000], f16), T([1024, 128], f16), T([128, 30000], f16, stride=(1, 128))), {})
Operator: aten.bmm.default
cnt: 12, ((T([128, 512, 64], f16), T([128, 64, 512], f16)), {})
cnt: 12, ((T([128, 512, 512], f16), T([128, 512, 64], f16)), {})
cnt: 12, ((T([128, 512, 512], f16, stride=(262144, 1, 512)), T([128, 512, 64], f16)), {})
cnt: 12, ((T([128, 512, 64], f16), T([128, 64, 512], f16, stride=(32768, 1, 64))), {})
cnt: 12, ((T([128, 64, 512], f16, stride=(32768, 1, 64)), T([128, 512, 512], f16)), {})
cnt: 12, ((T([128, 512, 512], f16), T([128, 512, 64], f16, stride=(32768, 1, 512))), {})
Operator: aten.clone.default
cnt: 2, ((T([2, 512], i64),), {})
Operator: aten.copy_.default
cnt: 2, ((T([2, 512], i64), T([2, 512], i64)), {})
Operator: aten.div.Tensor
cnt: 24, ((T([2, 64, 512, 512], f16), 8.0), {})
Operator: aten.embedding.default
cnt: 1, ((T([30000, 128], f16), T([2, 512], i64), 0), {})
cnt: 1, ((T([2, 128], f16), T([2, 512], i64, stride=(0, 1))), {})
cnt: 1, ((T([512, 128], f16), T([1, 512], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([1, 512, 128], f16), T([1, 512], i64), 512, -1, False), {})
cnt: 1, ((T([2, 512, 128], f16), T([2, 512], i64, stride=(0, 1)), 2, -1, False), {})
cnt: 1, ((T([2, 512, 128], f16), T([2, 512], i64), 30000, 0, False), {})
Operator: aten.mm.default
cnt: 1, ((T([1024, 30000], f16), T([30000, 128], f16)), {})
cnt: 1, ((T([30000, 1024], f16, stride=(1, 30000)), T([1024, 128], f16)), {})
cnt: 1, ((T([1024, 128], f16), T([128, 4096], f16)), {})
cnt: 1, ((T([128, 1024], f16, stride=(1, 128)), T([1024, 4096], f16)), {})
cnt: 12, ((T([1024, 4096], f16), T([4096, 16384], f16)), {})
cnt: 12, ((T([4096, 1024], f16, stride=(1, 4096)), T([1024, 16384], f16)), {})
cnt: 12, ((T([1024, 16384], f16), T([16384, 4096], f16)), {})
cnt: 12, ((T([16384, 1024], f16, stride=(1, 16384)), T([1024, 4096], f16)), {})
cnt: 48, ((T([1024, 4096], f16), T([4096, 4096], f16)), {})
cnt: 48, ((T([4096, 1024], f16, stride=(1, 4096)), T([1024, 4096], f16)), {})
cnt: 1, ((T([1024, 4096], f16), T([4096, 128], f16)), {})
cnt: 1, ((T([4096, 1024], f16, stride=(1, 4096)), T([1024, 128], f16)), {})
Operator: aten.mul.Scalar
cnt: 1, ((T([2, 512, 128], f16), 3.0), {})
cnt: 12, ((T([2, 512, 16384], f16), 3.0), {})
Operator: aten.mul.Tensor
cnt: 1, ((T([2, 1, 1, 512], f16), -65504.0), {})
cnt: 24, ((T([2, 512, 16384], f16), 0.5), {})
cnt: 24, ((T([2, 512, 16384], f16), 0.044715), {})
cnt: 24, ((T([2, 512, 16384], f16), 0.7978845608028654), {})
cnt: 48, ((T([2, 512, 16384], f16), T([2, 512, 16384], f16)), {})
cnt: 2, ((T([2, 512, 128], f16), 0.5), {})
cnt: 2, ((T([2, 512, 128], f16), 0.044715), {})
cnt: 2, ((T([2, 512, 128], f16), 0.7978845608028654), {})
cnt: 4, ((T([2, 512, 128], f16), T([2, 512, 128], f16)), {})
Operator: aten.native_layer_norm.default
cnt: 2, ((T([2, 512, 128], f16), [128], T([128], f16), T([128], f16), 1e-12), {})
cnt: 24, ((T([2, 512, 4096], f16), [4096], T([4096], f16), T([4096], f16), 1e-12), {})
Operator: aten.native_layer_norm_backward.default
cnt: 2, ((T([2, 512, 128], f16), T([2, 512, 128], f16), [128], T([2, 512, 1], f32), T([2, 512, 1], f32), T([128], f16), T([128], f16), [True, True, True]), {})
cnt: 24, ((T([2, 512, 4096], f16), T([2, 512, 4096], f16), [4096], T([2, 512, 1], f32), T([2, 512, 1], f32), T([4096], f16), T([4096], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([1024, 30000], f16), T([1024], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([1024, 30000], f16), T([1024], i64), None, 1, -100), {})
Operator: aten.pow.Tensor_Scalar
cnt: 12, ((T([2, 512, 16384], f16), 3.0), {})
cnt: 1, ((T([2, 512, 128], f16), 3.0), {})
cnt: 1, ((T([2, 512, 128], f16), 2.0), {})
cnt: 12, ((T([2, 512, 16384], f16), 2.0), {})
Operator: aten.rsub.Scalar
cnt: 1, ((T([2, 1, 1, 512], f16), 1.0), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([1024, 30000], f16), [0], True), {})
cnt: 1, ((T([1024, 128], f16), [0], True), {})
cnt: 61, ((T([1024, 4096], f16), [0], True), {})
cnt: 12, ((T([1024, 16384], f16), [0], True), {})
cnt: 1, ((T([2, 512, 128], f16), [0], True), {})
Operator: aten.tanh.default
cnt: 12, ((T([2, 512, 16384], f16),), {})
cnt: 1, ((T([2, 512, 128], f16),), {})
Operator: aten.tanh_backward.default
cnt: 1, ((T([2, 512, 128], f16), T([2, 512, 128], f16)), {})
cnt: 12, ((T([2, 512, 16384], f16), T([2, 512, 16384], f16)), {})

View File

@ -0,0 +1,110 @@
Operator: aten._log_softmax.default
cnt: 2, ((T([2, 512], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 2, ((T([2, 512], f16), T([2, 512], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 12, ((T([2, 64, 512, 512], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 12, ((T([2, 64, 512, 512], f16), T([2, 64, 512, 512], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([2, 1, 1, 512], f32),), {'dtype': f16})
Operator: aten._unsafe_view.default
cnt: 36, ((T([2, 64, 512, 64], f16), [128, 512, 64]), {})
cnt: 12, ((T([2, 64, 64, 512], f16), [128, 64, 512]), {})
cnt: 12, ((T([128, 512, 512], f16), [2, 64, 512, 512]), {})
cnt: 12, ((T([128, 512, 64], f16), [2, 64, 512, 64]), {})
cnt: 36, ((T([2, 512, 64, 64], f16), [2, 512, 4096]), {})
cnt: 12, ((T([2, 512, 4096], f16), [1024, 4096]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([2, 512, 128], f16), T([2, 512, 128], f16)), {})
cnt: 12, ((T([2, 64, 512, 512], f16), T([2, 1, 1, 512], f16)), {})
cnt: 72, ((T([2, 512, 4096], f16), T([2, 512, 4096], f16)), {})
cnt: 36, ((T([2, 512, 16384], f16), T([2, 512, 16384], f16)), {})
cnt: 12, ((T([2, 512, 16384], f16), 1.0), {})
cnt: 1, ((T([], f16), T([], f16)), {})
cnt: 99, ((T([4096], f16), T([4096], f16)), {})
cnt: 11, ((T([4096, 16384], f16), T([4096, 16384], f16)), {})
cnt: 11, ((T([16384], f16), T([16384], f16)), {})
cnt: 11, ((T([16384, 4096], f16), T([16384, 4096], f16)), {})
cnt: 44, ((T([4096, 4096], f16), T([4096, 4096], f16)), {})
Operator: aten.add_.Tensor
cnt: 1, ((T([2, 512, 128], f16), T([1, 512, 128], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([4096], f16), T([1024, 128], f16), T([128, 4096], f16, stride=(1, 128))), {})
cnt: 48, ((T([4096], f16), T([1024, 4096], f16), T([4096, 4096], f16, stride=(1, 4096))), {})
cnt: 12, ((T([16384], f16), T([1024, 4096], f16), T([4096, 16384], f16, stride=(1, 4096))), {})
cnt: 12, ((T([4096], f16), T([1024, 16384], f16), T([16384, 4096], f16, stride=(1, 16384))), {})
cnt: 1, ((T([2], f16), T([1024, 4096], f16), T([4096, 2], f16, stride=(1, 4096))), {})
Operator: aten.bmm.default
cnt: 12, ((T([128, 512, 64], f16), T([128, 64, 512], f16)), {})
cnt: 12, ((T([128, 512, 512], f16), T([128, 512, 64], f16)), {})
cnt: 12, ((T([128, 512, 512], f16, stride=(262144, 1, 512)), T([128, 512, 64], f16)), {})
cnt: 12, ((T([128, 512, 64], f16), T([128, 64, 512], f16, stride=(32768, 1, 64))), {})
cnt: 12, ((T([128, 64, 512], f16, stride=(32768, 1, 64)), T([128, 512, 512], f16)), {})
cnt: 12, ((T([128, 512, 512], f16), T([128, 512, 64], f16, stride=(32768, 1, 512))), {})
Operator: aten.cat.default
cnt: 1, (([T([2, 512, 1], f16), T([2, 512, 1], f16)], 2), {})
Operator: aten.clamp.default
cnt: 2, ((T([2], i64), 0, 512), {})
Operator: aten.clone.default
cnt: 1, ((T([2, 512], i64),), {})
cnt: 2, ((T([2], i64),), {})
Operator: aten.copy_.default
cnt: 1, ((T([2, 512], i64), T([2, 512], i64)), {})
cnt: 2, ((T([2], i64), T([2], i64)), {})
Operator: aten.div.Tensor
cnt: 24, ((T([2, 64, 512, 512], f16), 8.0), {})
cnt: 2, ((T([], f16), 2), {})
Operator: aten.embedding.default
cnt: 1, ((T([30000, 128], f16), T([2, 512], i64), 0), {})
cnt: 1, ((T([2, 128], f16), T([2, 512], i64, stride=(0, 1))), {})
cnt: 1, ((T([512, 128], f16), T([1, 512], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([1, 512, 128], f16), T([1, 512], i64), 512, -1, False), {})
cnt: 1, ((T([2, 512, 128], f16), T([2, 512], i64, stride=(0, 1)), 2, -1, False), {})
cnt: 1, ((T([2, 512, 128], f16), T([2, 512], i64), 30000, 0, False), {})
Operator: aten.mm.default
cnt: 1, ((T([1024, 2], f16), T([2, 4096], f16)), {})
cnt: 1, ((T([2, 1024], f16, stride=(1, 2)), T([1024, 4096], f16)), {})
cnt: 12, ((T([1024, 4096], f16), T([4096, 16384], f16)), {})
cnt: 12, ((T([4096, 1024], f16, stride=(1, 4096)), T([1024, 16384], f16)), {})
cnt: 12, ((T([1024, 16384], f16), T([16384, 4096], f16)), {})
cnt: 12, ((T([16384, 1024], f16, stride=(1, 16384)), T([1024, 4096], f16)), {})
cnt: 48, ((T([1024, 4096], f16), T([4096, 4096], f16)), {})
cnt: 48, ((T([4096, 1024], f16, stride=(1, 4096)), T([1024, 4096], f16)), {})
cnt: 1, ((T([1024, 4096], f16), T([4096, 128], f16)), {})
cnt: 1, ((T([4096, 1024], f16, stride=(1, 4096)), T([1024, 128], f16)), {})
Operator: aten.mul.Scalar
cnt: 12, ((T([2, 512, 16384], f16), 3.0), {})
Operator: aten.mul.Tensor
cnt: 1, ((T([2, 1, 1, 512], f16), -65504.0), {})
cnt: 24, ((T([2, 512, 16384], f16), 0.5), {})
cnt: 24, ((T([2, 512, 16384], f16), 0.044715), {})
cnt: 24, ((T([2, 512, 16384], f16), 0.7978845608028654), {})
cnt: 48, ((T([2, 512, 16384], f16), T([2, 512, 16384], f16)), {})
Operator: aten.native_layer_norm.default
cnt: 1, ((T([2, 512, 128], f16), [128], T([128], f16), T([128], f16), 1e-12), {})
cnt: 24, ((T([2, 512, 4096], f16), [4096], T([4096], f16), T([4096], f16), 1e-12), {})
Operator: aten.native_layer_norm_backward.default
cnt: 24, ((T([2, 512, 4096], f16), T([2, 512, 4096], f16), [4096], T([2, 512, 1], f32), T([2, 512, 1], f32), T([4096], f16), T([4096], f16), [True, True, True]), {})
cnt: 1, ((T([2, 512, 128], f16), T([2, 512, 128], f16), [128], T([2, 512, 1], f32), T([2, 512, 1], f32), T([128], f16), T([128], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 2, ((T([], f16), T([2, 512], f16), T([2], i64), None, 1, 512, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 2, ((T([2, 512], f16), T([2], i64), None, 1, 512), {})
Operator: aten.pow.Tensor_Scalar
cnt: 12, ((T([2, 512, 16384], f16), 3.0), {})
cnt: 12, ((T([2, 512, 16384], f16), 2.0), {})
Operator: aten.rsub.Scalar
cnt: 1, ((T([2, 1, 1, 512], f16), 1.0), {})
Operator: aten.split.Tensor
cnt: 1, ((T([2, 512, 2], f16), 1, -1), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([1024, 2], f16), [0], True), {})
cnt: 61, ((T([1024, 4096], f16), [0], True), {})
cnt: 12, ((T([1024, 16384], f16), [0], True), {})
cnt: 1, ((T([2, 512, 128], f16), [0], True), {})
Operator: aten.tanh.default
cnt: 12, ((T([2, 512, 16384], f16),), {})
Operator: aten.tanh_backward.default
cnt: 12, ((T([2, 512, 16384], f16), T([2, 512, 16384], f16)), {})

View File

@ -0,0 +1,186 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([1024, 50265], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([1024, 50265], f16), T([1024, 50265], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 12, ((T([1, 1024, 12, 513], f16, stride=(6303744, 513, 525312, 1)), -1, True), {})
Operator: aten._softmax_backward_data.default
cnt: 12, ((T([1, 1024, 12, 513], f32), T([1, 1024, 12, 513], f32), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([1, 1, 1, 1024], f32),), {'dtype': f16})
cnt: 1, ((T([1, 1024], b8),), {'dtype': i32})
cnt: 1, ((T([1, 1024], i64),), {'dtype': i32, 'layout': torch.strided, 'device': 'cuda'})
cnt: 1, ((T([1, 1024], i32),), {'dtype': i64})
cnt: 12, ((T([1, 1024, 1, 1], b8),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
cnt: 12, ((T([1, 1024, 12, 513], f32),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
cnt: 12, ((T([1, 1024, 12, 513], f16, stride=(6303744, 513, 525312, 1)),), {'dtype': f32, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten._unsafe_view.default
cnt: 12, ((T([12, 3, 512, 64, 1], f16), [36, 512, 64]), {})
cnt: 12, ((T([12, 3, 64, 512, 1], f16), [36, 64, 512]), {})
cnt: 12, ((T([12, 4, 768, 64, 1], f16), [48, 768, 64]), {})
cnt: 24, ((T([1024, 1, 12, 64], f16), [1024, 1, 768]), {})
cnt: 12, ((T([12, 4, 256, 1, 64], f16), [48, 256, 64]), {})
cnt: 12, ((T([12, 4, 768, 64], i64), [2359296]), {})
cnt: 12, ((T([12, 3, 512, 64], f16), [1179648]), {})
cnt: 24, ((T([12, 3, 512, 64], i64), [1179648]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([1, 1024], i64), 1), {})
cnt: 50, ((T([1, 1024, 768], f16), T([1, 1024, 768], f16)), {})
cnt: 36, ((T([12, 3, 512, 513], f16), T([12, 3, 512, 513], f16)), {})
cnt: 24, ((T([1024, 1, 768], f16), T([1024, 1, 768], f16)), {})
cnt: 1, ((T([50265, 768], f16), T([50265, 768], f16)), {})
Operator: aten.add_.Tensor
cnt: 12, ((T([1, 1024, 12, 513], f16, stride=(6303744, 513, 525312, 1)), T([1, 1024, 1, 513], f16)), {})
Operator: aten.addmm.default
cnt: 49, ((T([768], f16), T([1024, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 12, ((T([3072], f16), T([1024, 768], f16), T([768, 3072], f16, stride=(1, 768))), {})
cnt: 12, ((T([768], f16), T([1024, 3072], f16), T([3072, 768], f16, stride=(1, 3072))), {})
cnt: 1, ((T([50265], f16), T([1024, 768], f16), T([768, 50265], f16, stride=(1, 768))), {})
Operator: aten.any.default
cnt: 1, ((T([1024], b8),), {})
Operator: aten.bmm.default
cnt: 12, ((T([36, 512, 64], f16), T([36, 64, 512], f16)), {})
cnt: 12, ((T([48, 256, 768], f16, stride=(197120, 769, 1)), T([48, 768, 64], f16)), {})
cnt: 12, ((T([48, 768, 256], f16, stride=(197120, 1, 769)), T([48, 256, 64], f16)), {})
cnt: 12, ((T([48, 256, 64], f16), T([48, 64, 768], f16, stride=(49152, 1, 64))), {})
cnt: 12, ((T([36, 64, 512], f16, stride=(32768, 1, 64)), T([36, 512, 512], f16)), {})
cnt: 12, ((T([36, 512, 512], f16), T([36, 512, 64], f16, stride=(32768, 1, 512))), {})
Operator: aten.clone.default
cnt: 2, ((T([1, 1024], i64),), {})
Operator: aten.constant_pad_nd.default
cnt: 12, ((T([12, 3, 512, 512], f16), [0, 0, 0, 1], 0.0), {})
cnt: 12, ((T([1, 3, 512, 512], f16), [0, 0, 0, 1], 0.0), {})
cnt: 12, ((T([12, 1024, 64], f16, stride=(64, 768, 1)), [0, 0, 256, 256], -1.0), {})
cnt: 12, ((T([12, 4, 256, 513], f16, stride=(513, 1575936, 6156, 1)), [0, 257], 0.0), {})
cnt: 12, ((T([12, 4, 256, 770], f16), [0, -257]), {})
cnt: 12, ((T([12, 1536, 64], f16), [0, 0, -256, -256]), {})
cnt: 12, ((T([12, 3, 513, 512], f16), [0, 0, 0, -1]), {})
Operator: aten.copy_.default
cnt: 2, ((T([1, 1024], i64), T([1, 1024], i64)), {})
cnt: 12, ((T([12, 3, 256, 257], f16, stride=(525312, 131328, 513, 1)), T([12, 3, 256, 257], f16, stride=(787968, 262656, 513, 1))), {})
cnt: 12, ((T([12, 256, 257], f16, stride=(525312, 513, 1)), T([12, 256, 257], f16, stride=(787968, 513, 1))), {})
cnt: 12, ((T([12, 3, 256, 256], f16, stride=(525312, 131328, 513, 1)), T([12, 3, 256, 256], f16, stride=(787968, 262656, 513, 1))), {})
cnt: 12, ((T([12, 255, 255], f16, stride=(525312, 513, 1)), T([12, 255, 255], f16, stride=(787968, 513, 1))), {})
cnt: 12, ((T([1, 3, 256, 257], f16, stride=(525312, 131328, 513, 1)), T([1, 3, 256, 257], f16, stride=(787968, 262656, 513, 1))), {})
cnt: 12, ((T([1, 256, 257], f16, stride=(525312, 513, 1)), T([1, 256, 257], f16, stride=(787968, 513, 1))), {})
cnt: 12, ((T([1, 3, 256, 256], f16, stride=(525312, 131328, 513, 1)), T([1, 3, 256, 256], f16, stride=(787968, 262656, 513, 1))), {})
cnt: 12, ((T([1, 255, 255], f16, stride=(525312, 513, 1)), T([1, 255, 255], f16, stride=(787968, 513, 1))), {})
cnt: 12, ((T([1024, 12, 513], f16, stride=(513, 525312, 1)), T([1024, 12, 513], f16)), {})
cnt: 84, ((T([12, 4, 256, 513], f16), T([12, 4, 256, 513], f16)), {})
cnt: 12, ((T([1, 1024, 12, 513], f16, stride=(6303744, 513, 525312, 1)), T([1, 1024, 12, 513], f16)), {})
cnt: 24, ((T([1, 256, 12, 257], f16, stride=(6303744, 513, 525312, 1)), T([1, 256, 12, 257], f16)), {})
cnt: 12, ((T([12, 255, 255], f16, stride=(525312, 513, 1)), T([12, 255, 255], f16)), {})
cnt: 12, ((T([12, 3, 256, 256], f16, stride=(525312, 131328, 513, 1)), T([12, 3, 256, 256], f16)), {})
cnt: 12, ((T([12, 256, 257], f16, stride=(525312, 513, 1)), T([12, 256, 257], f16)), {})
cnt: 24, ((T([1024, 768], f16), T([1024, 768], f16)), {})
cnt: 12, ((T([1024, 1, 768], f16), T([1024, 1, 768], f16)), {})
Operator: aten.cumsum.default
cnt: 1, ((T([1, 1024], i32), 1), {})
Operator: aten.div.Tensor
cnt: 12, ((T([1024, 1, 768], f16), 8.0), {})
Operator: aten.div_.Tensor
cnt: 12, ((T([1024, 1, 768], f16), 8.0), {})
Operator: aten.embedding.default
cnt: 1, ((T([50265, 768], f16), T([1, 1024], i64), 1), {})
cnt: 1, ((T([4098, 768], f16), T([1, 1024], i64), 1), {})
cnt: 1, ((T([1, 768], f16), T([1, 1024], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([1, 1024, 768], f16), T([1, 1024], i64), 1, -1, False), {})
cnt: 1, ((T([1, 1024, 768], f16), T([1, 1024], i64), 4098, 1, False), {})
cnt: 1, ((T([1, 1024, 768], f16), T([1, 1024], i64), 50265, 1, False), {})
Operator: aten.eq.Scalar
cnt: 24, ((T([1, 256, 12, 257], f16, stride=(65792, 257, 0, 1)), 1), {})
cnt: 24, ((T([1, 256, 1, 257], f16), 1), {})
Operator: aten.flip.default
cnt: 24, ((T([256, 257], f16), [0]), {})
cnt: 24, ((T([1, 256, 1, 257], f16), [1, 3]), {})
Operator: aten.gelu.default
cnt: 12, ((T([1, 1024, 3072], f16),), {})
cnt: 1, ((T([1, 1024, 768], f16),), {})
Operator: aten.gelu_backward.default
cnt: 1, ((T([1, 1024, 768], f16), T([1, 1024, 768], f16)), {})
cnt: 12, ((T([1, 1024, 3072], f16), T([1, 1024, 3072], f16)), {})
Operator: aten.gt.Scalar
cnt: 1, ((T([1, 1024], f16), 0), {})
Operator: aten.index_add_.default
cnt: 12, ((T([1179648], f16), 0, T([2359296], i64), T([2359296], f16)), {})
cnt: 24, ((T([786432], f16), 0, T([1179648], i64), T([1179648], f16)), {})
Operator: aten.lt.Scalar
cnt: 1, ((T([1, 1024], f16), 0), {})
Operator: aten.masked_fill.Scalar
cnt: 12, ((T([1, 1024, 1, 1], f16), T([1, 1024, 1, 1], b8), -65504.0), {})
cnt: 12, ((T([1, 1024, 12, 513], f32), T([1, 1024, 1, 1], b8), 0.0), {})
cnt: 12, ((T([1, 1024, 12, 513], f32, stride=(6303744, 513, 525312, 1)), T([1, 1024, 1, 1], b8), 0), {})
cnt: 24, ((T([1, 256, 12, 257], f16), T([1, 256, 12, 257], b8), 0), {})
Operator: aten.masked_fill_.Scalar
cnt: 24, ((T([1, 256, 12, 257], f16, stride=(6303744, 513, 525312, 1)), T([1, 256, 12, 257], b8), -inf), {})
cnt: 24, ((T([1, 256, 1, 257], f16, stride=(525312, 513, 525312, 1)), T([1, 256, 1, 257], b8), -inf), {})
Operator: aten.mm.default
cnt: 1, ((T([1024, 50265], f16), T([50265, 768], f16)), {})
cnt: 1, ((T([50265, 1024], f16, stride=(1, 50265)), T([1024, 768], f16)), {})
cnt: 49, ((T([1024, 768], f16), T([768, 768], f16)), {})
cnt: 49, ((T([768, 1024], f16, stride=(1, 768)), T([1024, 768], f16)), {})
cnt: 12, ((T([1024, 768], f16), T([768, 3072], f16)), {})
cnt: 12, ((T([768, 1024], f16, stride=(1, 768)), T([1024, 3072], f16)), {})
cnt: 12, ((T([1024, 3072], f16), T([3072, 768], f16)), {})
cnt: 12, ((T([3072, 1024], f16, stride=(1, 3072)), T([1024, 768], f16)), {})
Operator: aten.mul.Tensor
cnt: 1, ((T([1, 1, 1, 1024], f16), -65504.0), {})
cnt: 1, ((T([1, 1024], i32), T([1, 1024], i32)), {})
cnt: 12, ((T([1, 3, 512, 1], f16, stride=(1024, 256, 1, 1)), T([1, 3, 1, 512], f16, stride=(1024, 256, 1, 1))), {})
Operator: aten.native_layer_norm.default
cnt: 26, ((T([1, 1024, 768], f16), [768], T([768], f16), T([768], f16), 1e-05), {})
Operator: aten.native_layer_norm_backward.default
cnt: 26, ((T([1, 1024, 768], f16), T([1, 1024, 768], f16), [768], T([1, 1024, 1], f32), T([1, 1024, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
Operator: aten.ne.Scalar
cnt: 1, ((T([1, 1024], i64), 1), {})
cnt: 12, ((T([1, 1024], f16), 0), {})
Operator: aten.new_empty.default
cnt: 12, ((T([12, 3, 512, 513], f16), [12, 4, 256, 513]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda', 'pin_memory': False})
cnt: 12, ((T([1, 3, 512, 513], f16), [1, 4, 256, 513]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda', 'pin_memory': False})
Operator: aten.new_empty_strided.default
cnt: 84, ((T([12, 4, 256, 513], f16), [12, 4, 256, 513], [525312, 131328, 513, 1]), {})
cnt: 12, ((T([1024, 768], f16), [1024, 768], [768, 1]), {})
Operator: aten.new_ones.default
cnt: 12, ((T([1, 1024, 12, 513], f16, stride=(6303744, 513, 525312, 1)), [256, 257]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda', 'pin_memory': False})
cnt: 12, ((T([1, 1024, 1, 1], f16), [1, 1024, 1, 1]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda', 'pin_memory': False})
cnt: 12, ((T([1, 1024, 1, 513], f16), [256, 257]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda', 'pin_memory': False})
Operator: aten.new_zeros.default
cnt: 12, ((T([12, 4, 768, 64], f16), [1179648]), {})
cnt: 12, ((T([1024, 12, 513], f16), [6303744]), {})
cnt: 12, ((T([12, 3, 512, 64], f16, stride=(98304, 32768, 1, 512)), [786432]), {})
cnt: 12, ((T([12, 3, 512, 64], f16), [786432]), {})
cnt: 12, ((T([1024, 768], f16), [786432]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([1024, 50265], f16), T([1024], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([1024, 50265], f16), T([1024], i64), None, 1, -100), {})
Operator: aten.rsub.Scalar
cnt: 1, ((T([1, 1, 1, 1024], f16), 1.0), {})
Operator: aten.select_backward.default
cnt: 12, ((T([12, 512, 513], f16), [12, 3, 512, 513], 1, 0), {})
cnt: 12, ((T([12, 512, 513], f16), [12, 3, 512, 513], 1, -1), {})
Operator: aten.slice_backward.default
cnt: 12, ((T([12, 4, 256, 768], f16), [12, 4, 256, 769], 3, 0, -1, 1), {})
cnt: 12, ((T([12, 4, 256, 769], f16), [12, 4, 256, 769], 2, 0, 9223372036854775807, 1), {})
cnt: 12, ((T([12, 4, 256, 769], f16), [12, 4, 256, 769], 1, 0, 9223372036854775807, 1), {})
cnt: 12, ((T([12, 4, 256, 769], f16), [12, 4, 256, 769], 0, 0, 9223372036854775807, 1), {})
cnt: 12, ((T([12, 4, 196864], f16), [12, 4, 197120], 2, 0, -256, 1), {})
cnt: 12, ((T([12, 4, 197120], f16), [12, 4, 197120], 1, 0, 9223372036854775807, 1), {})
cnt: 12, ((T([12, 4, 197120], f16), [12, 4, 197120], 0, 0, 9223372036854775807, 1), {})
cnt: 12, ((T([12, 255, 255], f16), [12, 255, 513], 2, -255, 9223372036854775807, 1), {})
cnt: 12, ((T([12, 255, 513], f16), [12, 512, 513], 1, 0, 255, 1), {})
cnt: 48, ((T([12, 3, 512, 513], f16), [12, 3, 512, 513], 0, 0, 9223372036854775807, 1), {})
cnt: 12, ((T([12, 3, 256, 256], f16), [12, 3, 256, 513], 3, 257, 9223372036854775807, 1), {})
cnt: 12, ((T([12, 3, 256, 513], f16), [12, 3, 512, 513], 2, -257, -1, 1), {})
cnt: 24, ((T([12, 3, 512, 513], f16), [12, 3, 512, 513], 1, 0, 9223372036854775807, 1), {})
cnt: 12, ((T([12, 256, 257], f16), [12, 256, 513], 2, 0, 257, 1), {})
cnt: 12, ((T([12, 256, 513], f16), [12, 512, 513], 1, 256, 9223372036854775807, 1), {})
cnt: 12, ((T([12, 3, 256, 257], f16), [12, 3, 256, 513], 3, 0, 257, 1), {})
cnt: 12, ((T([12, 3, 256, 513], f16), [12, 3, 512, 513], 2, 0, 256, 1), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([1024, 50265], f16), [0], True), {})
cnt: 61, ((T([1024, 768], f16), [0], True), {})
cnt: 12, ((T([1024, 3072], f16), [0], True), {})
Operator: aten.tril.default
cnt: 24, ((T([256, 257], f16),), {})

View File

@ -0,0 +1,73 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([4096, 50265], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([4096, 50265], f16), T([4096, 50265], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 12, ((T([64, 1024, 1024], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 12, ((T([64, 1024, 1024], f16), T([64, 1024, 1024], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([1024, 1024], f32),), {'dtype': f16})
cnt: 1, ((T([4, 1, 1024, 1024], f16, stride=(0, 1048576, 1024, 1)),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten._unsafe_view.default
cnt: 36, ((T([4, 1024, 16, 64], f16), [4, 1024, 1024]), {})
cnt: 1, ((T([4096, 50265], f16), [4, 1024, 50265]), {})
cnt: 12, ((T([4, 16, 1024, 64], f16), [64, 1024, 64]), {})
cnt: 12, ((T([4, 1024, 1024], f16), [4096, 1024]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([1024], i64), 1), {})
cnt: 1, ((T([4, 1024], i64, stride=(0, 1)), 2), {})
cnt: 73, ((T([4, 1024, 1024], f16), T([4, 1024, 1024], f16)), {})
cnt: 12, ((T([4, 16, 1024, 1024], f16), T([4, 1, 1024, 1024], f16)), {})
cnt: 1, ((T([50265, 1024], f16), T([50265, 1024], f16)), {})
Operator: aten.addmm.default
cnt: 48, ((T([1024], f16), T([4096, 1024], f16), T([1024, 1024], f16, stride=(1, 1024))), {})
cnt: 12, ((T([4096], f16), T([4096, 1024], f16), T([1024, 4096], f16, stride=(1, 1024))), {})
cnt: 12, ((T([1024], f16), T([4096, 4096], f16), T([4096, 1024], f16, stride=(1, 4096))), {})
Operator: aten.bmm.default
cnt: 24, ((T([64, 1024, 64], f16), T([64, 64, 1024], f16, stride=(65536, 1, 64))), {})
cnt: 24, ((T([64, 1024, 1024], f16), T([64, 1024, 64], f16)), {})
cnt: 12, ((T([64, 1024, 1024], f16, stride=(1048576, 1, 1024)), T([64, 1024, 64], f16)), {})
cnt: 12, ((T([64, 64, 1024], f16, stride=(65536, 1, 64)), T([64, 1024, 1024], f16)), {})
Operator: aten.clone.default
cnt: 2, ((T([4, 1024], i64),), {})
Operator: aten.copy_.default
cnt: 2, ((T([4, 1024], i64), T([4, 1024], i64)), {})
Operator: aten.embedding.default
cnt: 1, ((T([50265, 1024], f16), T([4, 1024], i64), 1), {})
cnt: 1, ((T([1026, 1024], f16), T([4, 1024], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([4, 1024, 1024], f16), T([4, 1024], i64), 1026, -1, False), {})
cnt: 1, ((T([4, 1024, 1024], f16), T([4, 1024], i64), 50265, 1, False), {})
Operator: aten.gelu.default
cnt: 12, ((T([4, 1024, 4096], f16),), {})
Operator: aten.gelu_backward.default
cnt: 12, ((T([4, 1024, 4096], f16), T([4, 1024, 4096], f16)), {})
Operator: aten.lt.Tensor
cnt: 1, ((T([1024], i64), T([1024, 1], i64)), {})
Operator: aten.masked_fill_.Scalar
cnt: 1, ((T([1024, 1024], f32), T([1024, 1024], b8), 0), {})
Operator: aten.mm.default
cnt: 1, ((T([4096, 1024], f16), T([1024, 50265], f16, stride=(1, 1024))), {})
cnt: 1, ((T([50265, 4096], f16, stride=(1, 50265)), T([4096, 1024], f16)), {})
cnt: 1, ((T([4096, 50265], f16), T([50265, 1024], f16)), {})
cnt: 12, ((T([4096, 1024], f16), T([1024, 4096], f16)), {})
cnt: 12, ((T([1024, 4096], f16, stride=(1, 1024)), T([4096, 4096], f16)), {})
cnt: 12, ((T([4096, 4096], f16), T([4096, 1024], f16)), {})
cnt: 12, ((T([4096, 4096], f16, stride=(1, 4096)), T([4096, 1024], f16)), {})
cnt: 48, ((T([4096, 1024], f16), T([1024, 1024], f16)), {})
cnt: 48, ((T([1024, 4096], f16, stride=(1, 1024)), T([4096, 1024], f16)), {})
Operator: aten.mul.Tensor
cnt: 2, ((T([4, 1024, 1024], f16), 1.0), {})
cnt: 24, ((T([4, 1024, 1024], f16), 0.125), {})
Operator: aten.native_layer_norm.default
cnt: 25, ((T([4, 1024, 1024], f16), [1024], T([1024], f16), T([1024], f16), 1e-05), {})
Operator: aten.native_layer_norm_backward.default
cnt: 25, ((T([4, 1024, 1024], f16), T([4, 1024, 1024], f16), [1024], T([4, 1024, 1], f32), T([4, 1024, 1], f32), T([1024], f16), T([1024], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([4096, 50265], f16), T([4096], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([4096, 50265], f16), T([4096], i64), None, 1, -100), {})
Operator: aten.sum.SymInt
cnt: 60, ((T([4096, 1024], f16), [0], True), {})
cnt: 12, ((T([4096, 4096], f16), [0], True), {})

View File

@ -0,0 +1,89 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([2048, 50265], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([2048, 50265], f16), T([2048, 50265], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 36, ((T([32, 1024, 1024], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 36, ((T([32, 1024, 1024], f16), T([32, 1024, 1024], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([1024, 1024], f32),), {'dtype': f16})
cnt: 1, ((T([2, 1, 1024, 1024], f16, stride=(0, 1048576, 1024, 1)),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten._unsafe_view.default
cnt: 108, ((T([2, 1024, 16, 64], f16), [2, 1024, 1024]), {})
cnt: 1, ((T([2048, 50265], f16), [2, 1024, 50265]), {})
cnt: 36, ((T([2, 16, 1024, 64], f16), [32, 1024, 64]), {})
cnt: 36, ((T([2, 1024, 1024], f16), [2048, 1024]), {})
Operator: aten.add.Tensor
cnt: 2, ((T([2, 1024], i64, stride=(0, 1)), 2), {})
cnt: 193, ((T([2, 1024, 1024], f16), T([2, 1024, 1024], f16)), {})
cnt: 1, ((T([1024], i64), 1), {})
cnt: 12, ((T([2, 16, 1024, 1024], f16), T([2, 1, 1024, 1024], f16)), {})
cnt: 1, ((T([2, 1024, 50265], f16), T([1, 50265], f16)), {})
cnt: 2, ((T([50265, 1024], f16), T([50265, 1024], f16)), {})
Operator: aten.addmm.default
cnt: 144, ((T([1024], f16), T([2048, 1024], f16), T([1024, 1024], f16, stride=(1, 1024))), {})
cnt: 24, ((T([4096], f16), T([2048, 1024], f16), T([1024, 4096], f16, stride=(1, 1024))), {})
cnt: 24, ((T([1024], f16), T([2048, 4096], f16), T([4096, 1024], f16, stride=(1, 4096))), {})
Operator: aten.any.default
cnt: 24, ((T([2, 1024, 1024], b8),), {})
Operator: aten.bmm.default
cnt: 72, ((T([32, 1024, 64], f16), T([32, 64, 1024], f16, stride=(65536, 1, 64))), {})
cnt: 72, ((T([32, 1024, 1024], f16), T([32, 1024, 64], f16)), {})
cnt: 36, ((T([32, 1024, 1024], f16, stride=(1048576, 1, 1024)), T([32, 1024, 64], f16)), {})
cnt: 36, ((T([32, 64, 1024], f16, stride=(65536, 1, 64)), T([32, 1024, 1024], f16)), {})
Operator: aten.clone.default
cnt: 2, ((T([2, 1024], i64),), {})
cnt: 1, ((T([2, 1023], i64, stride=(1024, 1)),), {})
Operator: aten.copy_.default
cnt: 2, ((T([2, 1024], i64), T([2, 1024], i64)), {})
cnt: 1, ((T([2, 1023], i64, stride=(1024, 1)), T([2, 1023], i64)), {})
Operator: aten.embedding.default
cnt: 2, ((T([50265, 1024], f16), T([2, 1024], i64), 1), {})
cnt: 2, ((T([1026, 1024], f16), T([2, 1024], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 2, ((T([2, 1024, 1024], f16), T([2, 1024], i64), 1026, -1, False), {})
cnt: 2, ((T([2, 1024, 1024], f16), T([2, 1024], i64), 50265, 1, False), {})
Operator: aten.eq.Scalar
cnt: 1, ((T([2, 1024], i64), -100), {})
Operator: aten.fill_.Tensor
cnt: 1, ((T([2], i64, stride=(1024,)), T([], i64)), {})
Operator: aten.gelu.default
cnt: 24, ((T([2, 1024, 4096], f16),), {})
Operator: aten.gelu_backward.default
cnt: 24, ((T([2, 1024, 4096], f16), T([2, 1024, 4096], f16)), {})
Operator: aten.isinf.default
cnt: 12, ((T([2, 1024, 1024], f16),), {})
Operator: aten.isnan.default
cnt: 12, ((T([2, 1024, 1024], f16),), {})
Operator: aten.lt.Tensor
cnt: 1, ((T([1024], i64), T([1024, 1], i64)), {})
Operator: aten.masked_fill_.Scalar
cnt: 1, ((T([2, 1024], i64), T([2, 1024], b8), 1), {})
cnt: 1, ((T([1024, 1024], f32), T([1024, 1024], b8), 0), {})
Operator: aten.mm.default
cnt: 1, ((T([2048, 1024], f16), T([1024, 50265], f16, stride=(1, 1024))), {})
cnt: 1, ((T([50265, 2048], f16, stride=(1, 50265)), T([2048, 1024], f16)), {})
cnt: 1, ((T([2048, 50265], f16), T([50265, 1024], f16)), {})
cnt: 24, ((T([2048, 1024], f16), T([1024, 4096], f16)), {})
cnt: 24, ((T([1024, 2048], f16, stride=(1, 1024)), T([2048, 4096], f16)), {})
cnt: 24, ((T([2048, 4096], f16), T([4096, 1024], f16)), {})
cnt: 24, ((T([4096, 2048], f16, stride=(1, 4096)), T([2048, 1024], f16)), {})
cnt: 144, ((T([2048, 1024], f16), T([1024, 1024], f16)), {})
cnt: 144, ((T([1024, 2048], f16, stride=(1, 1024)), T([2048, 1024], f16)), {})
Operator: aten.mul.Tensor
cnt: 4, ((T([2, 1024, 1024], f16), 1.0), {})
cnt: 72, ((T([2, 1024, 1024], f16), 0.125), {})
Operator: aten.native_layer_norm.default
cnt: 62, ((T([2, 1024, 1024], f16), [1024], T([1024], f16), T([1024], f16), 1e-05), {})
Operator: aten.native_layer_norm_backward.default
cnt: 62, ((T([2, 1024, 1024], f16), T([2, 1024, 1024], f16), [1024], T([2, 1024, 1], f32), T([2, 1024, 1], f32), T([1024], f16), T([1024], f16), [True, True, True]), {})
Operator: aten.new_zeros.default
cnt: 1, ((T([2, 1024], i64), [2, 1024]), {'dtype': i64, 'layout': torch.strided, 'device': 'cuda', 'pin_memory': False})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([2048, 50265], f16), T([2048], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([2048, 50265], f16), T([2048], i64), None, 1, -100), {})
Operator: aten.sum.SymInt
cnt: 168, ((T([2048, 1024], f16), [0], True), {})
cnt: 24, ((T([2048, 4096], f16), [0], True), {})

View File

@ -0,0 +1,81 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([8192, 30522], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([8192, 30522], f16), T([8192, 30522], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 12, ((T([64, 12, 128, 128], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 12, ((T([64, 12, 128, 128], f16), T([64, 12, 128, 128], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([64, 1, 1, 128], f32),), {'dtype': f16})
Operator: aten._unsafe_view.default
cnt: 36, ((T([64, 12, 128, 64], f16), [768, 128, 64]), {})
cnt: 12, ((T([64, 12, 64, 128], f16), [768, 64, 128]), {})
cnt: 12, ((T([768, 128, 128], f16), [64, 12, 128, 128]), {})
cnt: 12, ((T([768, 128, 64], f16), [64, 12, 128, 64]), {})
cnt: 24, ((T([64, 128, 12, 64], f16), [64, 128, 768]), {})
cnt: 12, ((T([64, 128, 768], f16), [8192, 768]), {})
Operator: aten.add.Tensor
cnt: 73, ((T([64, 128, 768], f16), T([64, 128, 768], f16)), {})
cnt: 12, ((T([64, 12, 128, 128], f16), T([64, 1, 1, 128], f16)), {})
cnt: 1, ((T([30522, 768], f16), T([30522, 768], f16)), {})
Operator: aten.add_.Tensor
cnt: 1, ((T([64, 128, 768], f16), T([1, 128, 768], f16)), {})
Operator: aten.addmm.default
cnt: 49, ((T([768], f16), T([8192, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 12, ((T([3072], f16), T([8192, 768], f16), T([768, 3072], f16, stride=(1, 768))), {})
cnt: 12, ((T([768], f16), T([8192, 3072], f16), T([3072, 768], f16, stride=(1, 3072))), {})
cnt: 1, ((T([30522], f16), T([8192, 768], f16), T([768, 30522], f16, stride=(1, 768))), {})
Operator: aten.bmm.default
cnt: 12, ((T([768, 128, 64], f16), T([768, 64, 128], f16)), {})
cnt: 12, ((T([768, 128, 128], f16), T([768, 128, 64], f16)), {})
cnt: 12, ((T([768, 128, 128], f16, stride=(16384, 1, 128)), T([768, 128, 64], f16)), {})
cnt: 12, ((T([768, 128, 64], f16), T([768, 64, 128], f16, stride=(8192, 1, 64))), {})
cnt: 12, ((T([768, 64, 128], f16, stride=(8192, 1, 64)), T([768, 128, 128], f16)), {})
cnt: 12, ((T([768, 128, 128], f16), T([768, 128, 64], f16, stride=(8192, 1, 128))), {})
Operator: aten.clone.default
cnt: 2, ((T([64, 128], i64),), {})
Operator: aten.copy_.default
cnt: 2, ((T([64, 128], i64), T([64, 128], i64)), {})
Operator: aten.div.Tensor
cnt: 24, ((T([64, 12, 128, 128], f16), 8.0), {})
Operator: aten.embedding.default
cnt: 1, ((T([30522, 768], f16), T([64, 128], i64), 0), {})
cnt: 1, ((T([2, 768], f16), T([64, 128], i64, stride=(0, 1))), {})
cnt: 1, ((T([512, 768], f16), T([1, 128], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([1, 128, 768], f16), T([1, 128], i64), 512, -1, False), {})
cnt: 1, ((T([64, 128, 768], f16), T([64, 128], i64, stride=(0, 1)), 2, -1, False), {})
cnt: 1, ((T([64, 128, 768], f16), T([64, 128], i64), 30522, 0, False), {})
Operator: aten.gelu.default
cnt: 12, ((T([64, 128, 3072], f16),), {})
cnt: 1, ((T([64, 128, 768], f16),), {})
Operator: aten.gelu_backward.default
cnt: 1, ((T([64, 128, 768], f16), T([64, 128, 768], f16)), {})
cnt: 12, ((T([64, 128, 3072], f16), T([64, 128, 3072], f16)), {})
Operator: aten.mm.default
cnt: 1, ((T([8192, 30522], f16), T([30522, 768], f16)), {})
cnt: 1, ((T([30522, 8192], f16, stride=(1, 30522)), T([8192, 768], f16)), {})
cnt: 49, ((T([8192, 768], f16), T([768, 768], f16)), {})
cnt: 49, ((T([768, 8192], f16, stride=(1, 768)), T([8192, 768], f16)), {})
cnt: 12, ((T([8192, 768], f16), T([768, 3072], f16)), {})
cnt: 12, ((T([768, 8192], f16, stride=(1, 768)), T([8192, 3072], f16)), {})
cnt: 12, ((T([8192, 3072], f16), T([3072, 768], f16)), {})
cnt: 12, ((T([3072, 8192], f16, stride=(1, 3072)), T([8192, 768], f16)), {})
Operator: aten.mul.Tensor
cnt: 1, ((T([64, 1, 1, 128], f16), -65504.0), {})
Operator: aten.native_layer_norm.default
cnt: 26, ((T([64, 128, 768], f16), [768], T([768], f16), T([768], f16), 1e-12), {})
Operator: aten.native_layer_norm_backward.default
cnt: 26, ((T([64, 128, 768], f16), T([64, 128, 768], f16), [768], T([64, 128, 1], f32), T([64, 128, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([8192, 30522], f16), T([8192], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([8192, 30522], f16), T([8192], i64), None, 1, -100), {})
Operator: aten.rsub.Scalar
cnt: 1, ((T([64, 1, 1, 128], f16), 1.0), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([8192, 30522], f16), [0], True), {})
cnt: 61, ((T([8192, 768], f16), [0], True), {})
cnt: 12, ((T([8192, 3072], f16), [0], True), {})
cnt: 1, ((T([64, 128, 768], f16), [0], True), {})

View File

@ -0,0 +1,88 @@
Operator: aten._log_softmax.default
cnt: 2, ((T([64, 128], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 2, ((T([64, 128], f16), T([64, 128], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 12, ((T([64, 12, 128, 128], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 12, ((T([64, 12, 128, 128], f16), T([64, 12, 128, 128], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([64, 1, 1, 128], f32),), {'dtype': f16})
Operator: aten._unsafe_view.default
cnt: 36, ((T([64, 12, 128, 64], f16), [768, 128, 64]), {})
cnt: 12, ((T([64, 12, 64, 128], f16), [768, 64, 128]), {})
cnt: 12, ((T([768, 128, 128], f16), [64, 12, 128, 128]), {})
cnt: 12, ((T([768, 128, 64], f16), [64, 12, 128, 64]), {})
cnt: 24, ((T([64, 128, 12, 64], f16), [64, 128, 768]), {})
cnt: 12, ((T([64, 128, 768], f16), [8192, 768]), {})
Operator: aten.add.Tensor
cnt: 73, ((T([64, 128, 768], f16), T([64, 128, 768], f16)), {})
cnt: 12, ((T([64, 12, 128, 128], f16), T([64, 1, 1, 128], f16)), {})
cnt: 1, ((T([], f16), T([], f16)), {})
Operator: aten.add_.Tensor
cnt: 1, ((T([64, 128, 768], f16), T([1, 128, 768], f16)), {})
Operator: aten.addmm.default
cnt: 48, ((T([768], f16), T([8192, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 12, ((T([3072], f16), T([8192, 768], f16), T([768, 3072], f16, stride=(1, 768))), {})
cnt: 12, ((T([768], f16), T([8192, 3072], f16), T([3072, 768], f16, stride=(1, 3072))), {})
cnt: 1, ((T([2], f16), T([8192, 768], f16), T([768, 2], f16, stride=(1, 768))), {})
Operator: aten.bmm.default
cnt: 12, ((T([768, 128, 64], f16), T([768, 64, 128], f16)), {})
cnt: 12, ((T([768, 128, 128], f16), T([768, 128, 64], f16)), {})
cnt: 12, ((T([768, 128, 128], f16, stride=(16384, 1, 128)), T([768, 128, 64], f16)), {})
cnt: 12, ((T([768, 128, 64], f16), T([768, 64, 128], f16, stride=(8192, 1, 64))), {})
cnt: 12, ((T([768, 64, 128], f16, stride=(8192, 1, 64)), T([768, 128, 128], f16)), {})
cnt: 12, ((T([768, 128, 128], f16), T([768, 128, 64], f16, stride=(8192, 1, 128))), {})
Operator: aten.cat.default
cnt: 1, (([T([64, 128, 1], f16), T([64, 128, 1], f16)], 2), {})
Operator: aten.clamp.default
cnt: 2, ((T([64], i64), 0, 128), {})
Operator: aten.clone.default
cnt: 1, ((T([64, 128], i64),), {})
cnt: 2, ((T([64], i64),), {})
Operator: aten.copy_.default
cnt: 1, ((T([64, 128], i64), T([64, 128], i64)), {})
cnt: 2, ((T([64], i64), T([64], i64)), {})
Operator: aten.div.Tensor
cnt: 24, ((T([64, 12, 128, 128], f16), 8.0), {})
cnt: 2, ((T([], f16), 2), {})
Operator: aten.embedding.default
cnt: 1, ((T([30522, 768], f16), T([64, 128], i64), 0), {})
cnt: 1, ((T([2, 768], f16), T([64, 128], i64, stride=(0, 1))), {})
cnt: 1, ((T([512, 768], f16), T([1, 128], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([1, 128, 768], f16), T([1, 128], i64), 512, -1, False), {})
cnt: 1, ((T([64, 128, 768], f16), T([64, 128], i64, stride=(0, 1)), 2, -1, False), {})
cnt: 1, ((T([64, 128, 768], f16), T([64, 128], i64), 30522, 0, False), {})
Operator: aten.gelu.default
cnt: 12, ((T([64, 128, 3072], f16),), {})
Operator: aten.gelu_backward.default
cnt: 12, ((T([64, 128, 3072], f16), T([64, 128, 3072], f16)), {})
Operator: aten.mm.default
cnt: 1, ((T([8192, 2], f16), T([2, 768], f16)), {})
cnt: 1, ((T([2, 8192], f16, stride=(1, 2)), T([8192, 768], f16)), {})
cnt: 12, ((T([8192, 768], f16), T([768, 3072], f16)), {})
cnt: 12, ((T([768, 8192], f16, stride=(1, 768)), T([8192, 3072], f16)), {})
cnt: 12, ((T([8192, 3072], f16), T([3072, 768], f16)), {})
cnt: 12, ((T([3072, 8192], f16, stride=(1, 3072)), T([8192, 768], f16)), {})
cnt: 48, ((T([8192, 768], f16), T([768, 768], f16)), {})
cnt: 48, ((T([768, 8192], f16, stride=(1, 768)), T([8192, 768], f16)), {})
Operator: aten.mul.Tensor
cnt: 1, ((T([64, 1, 1, 128], f16), -65504.0), {})
Operator: aten.native_layer_norm.default
cnt: 25, ((T([64, 128, 768], f16), [768], T([768], f16), T([768], f16), 1e-12), {})
Operator: aten.native_layer_norm_backward.default
cnt: 25, ((T([64, 128, 768], f16), T([64, 128, 768], f16), [768], T([64, 128, 1], f32), T([64, 128, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 2, ((T([], f16), T([64, 128], f16), T([64], i64), None, 1, 128, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 2, ((T([64, 128], f16), T([64], i64), None, 1, 128), {})
Operator: aten.rsub.Scalar
cnt: 1, ((T([64, 1, 1, 128], f16), 1.0), {})
Operator: aten.split.Tensor
cnt: 1, ((T([64, 128, 2], f16), 1, -1), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([8192, 2], f16), [0], True), {})
cnt: 60, ((T([8192, 768], f16), [0], True), {})
cnt: 12, ((T([8192, 3072], f16), [0], True), {})
cnt: 1, ((T([64, 128, 768], f16), [0], True), {})

View File

@ -0,0 +1,237 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([1024, 50358], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([1024, 50358], f16), T([1024, 50358], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 24, ((T([1, 12, 64, 1024], f16), -1, False), {})
cnt: 24, ((T([1, 12, 64, 448], f16), -1, False), {})
cnt: 12, ((T([1, 12, 12, 64, 512], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 24, ((T([1, 12, 64, 1024], f16), T([1, 12, 64, 1024], f16), -1, f16), {})
cnt: 24, ((T([1, 12, 64, 448], f16), T([1, 12, 64, 448], f16), -1, f16), {})
cnt: 12, ((T([1, 12, 12, 64, 512], f16), T([1, 12, 12, 64, 512], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 12, ((T([1, 1, 12, 64, 192], f32),), {'dtype': f16})
cnt: 12, ((T([1, 1, 1024, 1], f32),), {'dtype': f16})
cnt: 12, ((T([1, 1, 1, 1024], f32),), {'dtype': f16})
cnt: 12, ((T([12, 14, 3], i32),), {'dtype': i64, 'device': 'cuda'})
Operator: aten._unsafe_view.default
cnt: 24, ((T([1, 12, 16, 64, 64], f16), [192, 64, 64]), {})
cnt: 24, ((T([1, 12, 12, 64, 64], f16), [144, 64, 64]), {})
cnt: 24, ((T([1, 12, 12, 192, 64], f16), [144, 192, 64]), {})
cnt: 24, ((T([1, 1024, 12, 64], f16), [1, 1024, 768]), {})
Operator: aten.add.Tensor
cnt: 76, ((T([1, 1024, 768], f16), T([1, 1024, 768], f16)), {})
cnt: 24, ((T([504], i64), T([504], i64)), {})
cnt: 36, ((T([1, 1024, 3072], f16), T([1, 1024, 3072], f16)), {})
cnt: 12, ((T([1, 1024, 3072], f16), 1.0), {})
cnt: 1, ((T([1, 1024, 768], f16), 1.0), {})
cnt: 360, ((T([1, 12, 16, 64, 64], f16), T([1, 12, 16, 64, 64], f16)), {})
cnt: 36, ((T([1, 12, 12, 64, 512], f16), T([1, 12, 12, 64, 512], f16)), {})
cnt: 48, ((T([1, 12, 14, 192, 64], f16), T([1, 12, 14, 192, 64], f16)), {})
cnt: 36, ((T([1, 12, 12, 64, 64], f16), T([1, 12, 12, 64, 64], f16)), {})
cnt: 24, ((T([1, 12, 1024, 64], f16), T([1, 12, 1024, 64], f16)), {})
cnt: 12, ((T([1, 12, 1024, 64], f16, stride=(786432, 65536, 1, 1024)), T([1, 12, 1024, 64], f16, stride=(786432, 65536, 1, 1024))), {})
cnt: 12, ((T([1, 12, 1024, 64], f16, stride=(786432, 65536, 1, 1024)), T([1, 12, 1024, 64], f16)), {})
cnt: 1, ((T([50358, 768], f16), T([50358, 768], f16)), {})
Operator: aten.add_.Tensor
cnt: 1, ((T([1, 1024, 768], f16), T([1, 1024, 768], f16)), {})
cnt: 24, ((T([1, 12, 64, 1024], f16), T([1, 1, 1, 1024], f16)), {})
cnt: 24, ((T([1, 12, 64, 448], f16), T([1, 12, 64, 448], f32)), {})
cnt: 12, ((T([1, 12, 12, 64, 192], f16), T([1, 1, 12, 64, 192], f16)), {})
cnt: 24, ((T([1, 12, 12, 64, 64], f16), T([1, 1, 1, 1, 64], f16)), {})
cnt: 12, ((T([1, 12, 12, 64, 192], f16), T([1, 12, 12, 64, 192], f32)), {})
cnt: 36, ((T([1, 12, 12, 64, 64], f16), T([1, 12, 12, 64, 64], f16)), {})
Operator: aten.addmm.default
cnt: 49, ((T([768], f16), T([1024, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 12, ((T([3072], f16), T([1024, 768], f16), T([768, 3072], f16, stride=(1, 768))), {})
cnt: 12, ((T([768], f16), T([1024, 3072], f16), T([3072, 768], f16, stride=(1, 3072))), {})
cnt: 1, ((T([768], f16), T([1, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 1, ((T([50358], f16), T([1024, 768], f16), T([768, 50358], f16, stride=(1, 768))), {})
Operator: aten.bmm.default
cnt: 48, ((T([12, 64, 64], f16, stride=(64, 768, 1)), T([12, 64, 1024], f16, stride=(64, 1, 768))), {})
cnt: 48, ((T([12, 64, 1024], f16), T([12, 1024, 64], f16, stride=(64, 768, 1))), {})
cnt: 48, ((T([12, 64, 64], f16, stride=(64, 768, 1)), T([12, 64, 448], f16, stride=(28672, 1, 64))), {})
cnt: 48, ((T([12, 64, 448], f16), T([12, 448, 64], f16)), {})
cnt: 48, ((T([144, 64, 64], f16), T([144, 64, 192], f16, stride=(12288, 1, 64))), {})
cnt: 24, ((T([12, 768, 64], f16, stride=(64, 768, 1)), T([12, 64, 64], f16, stride=(64, 1, 768))), {})
cnt: 24, ((T([144, 64, 192], f16, stride=(32768, 512, 1)), T([144, 192, 64], f16)), {})
cnt: 24, ((T([12, 768, 64], f16, stride=(393216, 512, 1)), T([12, 64, 64], f16, stride=(64, 768, 1))), {})
cnt: 24, ((T([12, 1024, 64], f16, stride=(65536, 1, 1024)), T([12, 64, 64], f16, stride=(64, 768, 1))), {})
cnt: 24, ((T([12, 64, 64], f16, stride=(64, 1, 768)), T([12, 64, 1024], f16)), {})
cnt: 24, ((T([12, 448, 64], f16, stride=(28672, 1, 448)), T([12, 64, 64], f16, stride=(64, 768, 1))), {})
cnt: 24, ((T([12, 64, 64], f16, stride=(64, 1, 768)), T([12, 64, 448], f16)), {})
cnt: 24, ((T([12, 64, 768], f16, stride=(393216, 1, 512)), T([12, 768, 64], f16)), {})
cnt: 24, ((T([12, 768, 64], f16), T([12, 64, 64], f16, stride=(64, 1, 768))), {})
cnt: 24, ((T([144, 192, 64], f16, stride=(32768, 1, 512)), T([144, 64, 64], f16)), {})
cnt: 24, ((T([12, 64, 768], f16, stride=(64, 1, 768)), T([12, 768, 64], f16)), {})
cnt: 24, ((T([12, 768, 64], f16), T([12, 64, 64], f16, stride=(64, 768, 1))), {})
cnt: 24, ((T([144, 64, 64], f16, stride=(4096, 1, 64)), T([144, 64, 192], f16)), {})
cnt: 24, ((T([144, 64, 192], f16), T([144, 192, 64], f16)), {})
Operator: aten.cat.default
cnt: 1, (([T([1, 12, 64], f32), T([1, 12, 64], f32), T([1, 12, 64], f32)], 2), {})
cnt: 12, (([T([1, 12, 14, 3], i64)],), {})
cnt: 48, (([T([1, 12, 64, 64], f16, stride=(768, 64, 768, 1)), T([1, 12, 64, 64], f16, stride=(768, 64, 768, 1)), T([1, 12, 64, 64], f16, stride=(768, 64, 768, 1)), T([1, 12, 64, 64], f16, stride=(768, 64, 768, 1)), T([1, 12, 192, 64], f16, stride=(2064384, 172032, 64, 1))], 2), {})
cnt: 12, (([T([1, 1, 1, 192], f16), T([1, 1, 1, 64], f16), T([1, 1, 1, 192], f16)], 3), {})
cnt: 24, (([T([1, 12, 64, 256], f32), T([1, 12, 64, 192], f32, stride=(2064384, 172032, 192, 1))], 3), {})
cnt: 24, (([T([1, 12, 12, 64, 64], f16, stride=(768, 64, 49152, 768, 1)), T([1, 12, 12, 64, 64], f16, stride=(768, 64, 49152, 768, 1)), T([1, 12, 12, 64, 64], f16, stride=(768, 64, 49152, 768, 1))], 3), {})
cnt: 12, (([T([1, 12, 12, 64, 64], f16), T([1, 12, 12, 64, 192], f16), T([1, 12, 12, 64, 192], f16), T([1, 12, 12, 64, 64], f16)], -1), {})
cnt: 12, (([T([1, 1, 1, 64], f16), T([1, 1, 1, 192], f16), T([1, 1, 1, 192], f16)], 3), {})
cnt: 12, (([T([1, 12, 1, 64, 64], f16), T([1, 12, 1, 64, 64], f16), T([1, 12, 12, 64, 64], f16), T([1, 12, 1, 64, 64], f16), T([1, 12, 1, 64, 64], f16)], 2), {})
Operator: aten.clone.default
cnt: 2, ((T([1, 1024], i64),), {})
Operator: aten.copy_.default
cnt: 2, ((T([1, 1024], i64), T([1, 1024], i64)), {})
cnt: 12, ((T([12, 12, 64, 64], f16), T([12, 12, 64, 64], f16, stride=(64, 49152, 768, 1))), {})
cnt: 36, ((T([144, 64, 64], f16), T([144, 64, 64], f16)), {})
cnt: 36, ((T([1, 12, 12, 64, 64], f16), T([1, 12, 12, 64, 64], f16)), {})
Operator: aten.embedding.default
cnt: 1, ((T([50358, 768], f16), T([1, 1024], i64), 0), {})
cnt: 1, ((T([2, 768], f16), T([1, 1024], i64)), {})
cnt: 1, ((T([4096, 768], f16), T([1, 1024], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([1, 1024, 768], f16), T([1, 1024], i64), 4096, -1, False), {})
cnt: 1, ((T([1, 1024, 768], f16), T([1, 1024], i64), 2, -1, False), {})
cnt: 1, ((T([1, 1024, 768], f16), T([1, 1024], i64), 50358, 0, False), {})
Operator: aten.floor_divide.default
cnt: 24, ((T([504], i64), 42), {})
Operator: aten.index.Tensor
cnt: 12, ((T([16, 64], f32), [T([504], i64)]), {})
Operator: aten.index_add.default
cnt: 24, ((T([192, 64, 64], f16), 0, T([504], i64), T([504, 64, 64], f16)), {})
Operator: aten.index_select.default
cnt: 24, ((T([192, 64, 64], f16), 0, T([504], i64)), {})
Operator: aten.minimum.default
cnt: 24, ((T([1, 1, 1, 448], f16), T([1, 12, 64, 448], f32)), {})
Operator: aten.mm.default
cnt: 1, ((T([1024, 50358], f16), T([50358, 768], f16)), {})
cnt: 1, ((T([50358, 1024], f16, stride=(1, 50358)), T([1024, 768], f16)), {})
cnt: 37, ((T([1024, 768], f16), T([768, 768], f16)), {})
cnt: 37, ((T([768, 1024], f16, stride=(1, 768)), T([1024, 768], f16)), {})
cnt: 12, ((T([1024, 768], f16), T([768, 3072], f16)), {})
cnt: 12, ((T([768, 1024], f16, stride=(1, 768)), T([1024, 3072], f16)), {})
cnt: 12, ((T([1024, 3072], f16), T([3072, 768], f16)), {})
cnt: 12, ((T([3072, 1024], f16, stride=(1, 3072)), T([1024, 768], f16)), {})
cnt: 12, ((T([1024, 768], f16, stride=(1, 1024)), T([768, 768], f16)), {})
cnt: 12, ((T([768, 1024], f16), T([1024, 768], f16)), {})
Operator: aten.mul.Scalar
cnt: 1, ((T([1, 1024, 768], f16), 3.0), {})
cnt: 12, ((T([1, 1024, 3072], f16), 3.0), {})
Operator: aten.mul.Tensor
cnt: 1, ((T([1, 12, 64, 1], f32), T([1, 12, 1, 192], f32)), {})
cnt: 12, ((T([1, 1, 14, 64, 1], f32), T([1, 12, 14, 1, 192], f32)), {})
cnt: 24, ((T([504], i64), 16), {})
cnt: 48, ((T([1, 12, 64, 1024], f16), 0.125), {})
cnt: 24, ((T([1, 1, 1, 1024], f16), -10000.0), {})
cnt: 48, ((T([1, 12, 64, 448], f16), 0.125), {})
cnt: 24, ((T([1, 12, 64, 448], f32), -10000.0), {})
cnt: 24, ((T([1, 12, 12, 64, 192], f16), 0.125), {})
cnt: 24, ((T([1, 12, 12, 64, 64], f16), 0.125), {})
cnt: 12, ((T([1, 1, 12, 64, 192], f16), -10000.0), {})
cnt: 24, ((T([1, 1, 1, 1, 64], f16), -10000.0), {})
cnt: 12, ((T([1, 12, 12, 64, 192], f32), -10000.0), {})
cnt: 12, ((T([1, 12, 1024, 64], f16), T([1, 1, 1024, 1], f16)), {})
cnt: 24, ((T([1, 1024, 3072], f16), 0.5), {})
cnt: 24, ((T([1, 1024, 3072], f16), 0.044715), {})
cnt: 24, ((T([1, 1024, 3072], f16), 0.7978845608028654), {})
cnt: 48, ((T([1, 1024, 3072], f16), T([1, 1024, 3072], f16)), {})
cnt: 2, ((T([1, 1024, 768], f16), 0.5), {})
cnt: 2, ((T([1, 1024, 768], f16), 0.044715), {})
cnt: 2, ((T([1, 1024, 768], f16), 0.7978845608028654), {})
cnt: 4, ((T([1, 1024, 768], f16), T([1, 1024, 768], f16)), {})
cnt: 12, ((T([1, 12, 1024, 64], f16, stride=(786432, 64, 768, 1)), T([1, 1, 1024, 1], f16)), {})
cnt: 24, ((T([1, 12, 12, 64, 64], f16, stride=(4718592, 393216, 32768, 512, 1)), 0.125), {})
cnt: 24, ((T([1, 12, 12, 64, 192], f16, stride=(4718592, 393216, 32768, 512, 1)), 0.125), {})
Operator: aten.native_layer_norm.default
cnt: 26, ((T([1, 1024, 768], f16), [768], T([768], f16), T([768], f16), 1e-12), {})
Operator: aten.native_layer_norm_backward.default
cnt: 26, ((T([1, 1024, 768], f16), T([1, 1024, 768], f16), [768], T([1, 1024, 1], f32), T([1, 1024, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
Operator: aten.new_empty_strided.default
cnt: 36, ((T([144, 64, 64], f16), [144, 64, 64], [4096, 64, 1]), {})
Operator: aten.new_ones.default
cnt: 24, ((T([1, 1, 1, 1024], f16), [1, 1, 1, 192]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda', 'pin_memory': False})
cnt: 24, ((T([1, 12, 14, 64, 192], f32), [1, 12, 64, 256]), {'dtype': f32, 'layout': torch.strided, 'device': 'cuda', 'pin_memory': False})
Operator: aten.new_zeros.default
cnt: 12, ((T([12, 12, 64, 64], f16, stride=(64, 49152, 768, 1)), [589824]), {})
cnt: 24, ((T([504, 64, 64], f16), [192, 64, 64]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([1024, 50358], f16), T([1024], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([1024, 50358], f16), T([1024], i64), None, 1, -100), {})
Operator: aten.pow.Tensor_Scalar
cnt: 12, ((T([1, 1024, 3072], f16), 3.0), {})
cnt: 1, ((T([1, 1024, 768], f16), 3.0), {})
cnt: 1, ((T([1, 1024, 768], f16), 2.0), {})
cnt: 12, ((T([1, 1024, 3072], f16), 2.0), {})
Operator: aten.rsub.Scalar
cnt: 24, ((T([1, 1, 1, 1024], f16), 1.0), {})
cnt: 24, ((T([1, 12, 64, 448], f32), 1.0), {})
cnt: 12, ((T([1, 1, 12, 64, 192], f16), 1.0), {})
cnt: 24, ((T([1, 1, 1, 1, 64], f16), 1.0), {})
cnt: 12, ((T([1, 12, 12, 64, 192], f32, stride=(2064384, 172032, 12288, 192, 1)), 1.0), {})
Operator: aten.select_backward.default
cnt: 24, ((T([1, 12, 64, 64], f16), [1, 12, 16, 64, 64], 2, -1), {})
cnt: 12, ((T([1, 12, 64, 64], f16), [1, 12, 16, 64, 64], 2, -2), {})
cnt: 12, ((T([1, 12, 192, 64], f16, stride=(344064, 28672, 64, 1)), [1, 12, 14, 192, 64], 2, -1), {})
cnt: 24, ((T([1, 12, 64, 64], f16, stride=(344064, 28672, 64, 1)), [1, 12, 16, 64, 64], 2, -1), {})
cnt: 12, ((T([1, 12, 64, 64], f16, stride=(344064, 28672, 64, 1)), [1, 12, 16, 64, 64], 2, -2), {})
cnt: 12, ((T([1, 12, 64, 64], f16, stride=(344064, 28672, 64, 1)), [1, 12, 16, 64, 64], 2, -3), {})
cnt: 24, ((T([1, 12, 64, 64], f16, stride=(344064, 28672, 64, 1)), [1, 12, 16, 64, 64], 2, 0), {})
cnt: 12, ((T([1, 12, 192, 64], f16, stride=(344064, 28672, 1, 448)), [1, 12, 14, 192, 64], 2, -1), {})
cnt: 24, ((T([1, 12, 64, 64], f16, stride=(344064, 28672, 1, 448)), [1, 12, 16, 64, 64], 2, -1), {})
cnt: 12, ((T([1, 12, 64, 64], f16, stride=(344064, 28672, 1, 448)), [1, 12, 16, 64, 64], 2, -2), {})
cnt: 12, ((T([1, 12, 64, 64], f16, stride=(344064, 28672, 1, 448)), [1, 12, 16, 64, 64], 2, -3), {})
cnt: 24, ((T([1, 12, 64, 64], f16, stride=(344064, 28672, 1, 448)), [1, 12, 16, 64, 64], 2, 0), {})
cnt: 24, ((T([1, 12, 64, 64], f16), [1, 12, 16, 64, 64], 2, 0), {})
cnt: 12, ((T([1, 12, 64, 64], f16, stride=(64, 4096, 1, 64)), [1, 12, 16, 64, 64], 2, -1), {})
cnt: 12, ((T([1, 12, 64, 64], f16, stride=(64, 4096, 1, 64)), [1, 12, 16, 64, 64], 2, 0), {})
cnt: 12, ((T([1, 12, 64, 64], f16), [1, 12, 16, 64, 64], 2, 1), {})
cnt: 12, ((T([1, 12, 192, 64], f16, stride=(344064, 28672, 64, 1)), [1, 12, 14, 192, 64], 2, 0), {})
cnt: 12, ((T([1, 12, 64, 64], f16, stride=(344064, 28672, 64, 1)), [1, 12, 16, 64, 64], 2, 2), {})
cnt: 12, ((T([1, 12, 64, 64], f16, stride=(344064, 28672, 64, 1)), [1, 12, 16, 64, 64], 2, 1), {})
cnt: 12, ((T([1, 12, 192, 64], f16, stride=(344064, 28672, 1, 448)), [1, 12, 14, 192, 64], 2, 0), {})
cnt: 12, ((T([1, 12, 64, 64], f16, stride=(344064, 28672, 1, 448)), [1, 12, 16, 64, 64], 2, 2), {})
cnt: 12, ((T([1, 12, 64, 64], f16, stride=(344064, 28672, 1, 448)), [1, 12, 16, 64, 64], 2, 1), {})
Operator: aten.slice_backward.default
cnt: 372, ((T([1, 12, 16, 64, 64], f16), [1, 12, 16, 64, 64], 1, 0, 9223372036854775807, 1), {})
cnt: 372, ((T([1, 12, 16, 64, 64], f16), [1, 12, 16, 64, 64], 0, 0, 9223372036854775807, 1), {})
cnt: 72, ((T([1, 12, 14, 192, 64], f16), [1, 12, 14, 192, 64], 1, 0, 9223372036854775807, 1), {})
cnt: 72, ((T([1, 12, 14, 192, 64], f16), [1, 12, 14, 192, 64], 0, 0, 9223372036854775807, 1), {})
cnt: 12, ((T([1, 12, 12, 64, 64], f16), [1, 12, 12, 64, 512], 4, -64, 9223372036854775807, 1), {})
cnt: 48, ((T([1, 12, 12, 64, 512], f16), [1, 12, 12, 64, 512], 3, 0, 9223372036854775807, 1), {})
cnt: 48, ((T([1, 12, 12, 64, 512], f16), [1, 12, 12, 64, 512], 2, 0, 9223372036854775807, 1), {})
cnt: 48, ((T([1, 12, 12, 64, 512], f16), [1, 12, 12, 64, 512], 1, 0, 9223372036854775807, 1), {})
cnt: 48, ((T([1, 12, 12, 64, 512], f16), [1, 12, 12, 64, 512], 0, 0, 9223372036854775807, 1), {})
cnt: 12, ((T([1, 12, 12, 64, 64], f16), [1, 12, 12, 64, 512], 4, 0, 64, 1), {})
cnt: 12, ((T([1, 12, 12, 192, 64], f16), [1, 12, 14, 192, 64], 2, 1, -1, 1), {})
cnt: 12, ((T([1, 12, 12, 64, 192], f16), [1, 12, 12, 64, 512], 4, 256, -64, 1), {})
cnt: 12, ((T([1, 12, 12, 64, 192], f16), [1, 12, 12, 64, 512], 4, 64, 256, 1), {})
cnt: 12, ((T([1, 12, 12, 192, 64], f16, stride=(1769472, 147456, 12288, 1, 192)), [1, 12, 14, 192, 64], 2, 1, -1, 1), {})
cnt: 12, ((T([1, 12, 12, 64, 64], f16), [1, 12, 16, 64, 64], 2, 2, -2, 1), {})
cnt: 12, ((T([1, 12, 12, 64, 64], f16, stride=(1769472, 147456, 12288, 64, 1)), [1, 12, 16, 64, 64], 2, 3, -1, 1), {})
cnt: 12, ((T([1, 12, 12, 64, 64], f16, stride=(1769472, 147456, 12288, 64, 1)), [1, 12, 16, 64, 64], 2, 2, -2, 1), {})
cnt: 12, ((T([1, 12, 12, 64, 64], f16, stride=(1769472, 147456, 12288, 64, 1)), [1, 12, 16, 64, 64], 2, 1, -3, 1), {})
cnt: 12, ((T([1, 12, 12, 64, 64], f16, stride=(1769472, 147456, 12288, 1, 192)), [1, 12, 16, 64, 64], 2, 3, -1, 1), {})
cnt: 12, ((T([1, 12, 12, 64, 64], f16, stride=(1769472, 147456, 12288, 1, 192)), [1, 12, 16, 64, 64], 2, 2, -2, 1), {})
cnt: 12, ((T([1, 12, 12, 64, 64], f16, stride=(1769472, 147456, 12288, 1, 192)), [1, 12, 16, 64, 64], 2, 1, -3, 1), {})
Operator: aten.stack.default
cnt: 12, (([T([504, 64], f32)],), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([1024, 50358], f16), [0], True), {})
cnt: 49, ((T([1024, 768], f16), [0], True), {})
cnt: 12, ((T([1024, 3072], f16), [0], True), {})
cnt: 12, ((T([1024, 768], f16, stride=(1, 1024)), [0], True), {})
Operator: aten.tanh.default
cnt: 12, ((T([1, 1024, 3072], f16),), {})
cnt: 1, ((T([1, 768], f16),), {})
cnt: 1, ((T([1, 1024, 768], f16),), {})
Operator: aten.tanh_backward.default
cnt: 1, ((T([1, 1024, 768], f16), T([1, 1024, 768], f16)), {})
cnt: 12, ((T([1, 1024, 3072], f16), T([1, 1024, 3072], f16)), {})
Operator: aten.unbind.int
cnt: 12, ((T([1, 16, 64], f32),), {})
cnt: 12, ((T([1, 12, 14, 3], i64),), {})
Operator: aten.unsqueeze_.default
cnt: 1, ((T([1, 12, 64, 192], f32), 1), {})
cnt: 12, ((T([12, 14, 3], i64), 0), {})
cnt: 48, ((T([1, 12, 64, 64], f16), 2), {})

View File

@ -0,0 +1,74 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([8192, 50265], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([8192, 50265], f16), T([8192, 50265], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 8, ((T([1024, 128, 128], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 8, ((T([1024, 128, 128], f16), T([1024, 128, 128], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([128, 128], f32),), {'dtype': f16})
cnt: 1, ((T([64, 1, 128, 128], f16, stride=(0, 16384, 128, 1)),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten._unsafe_view.default
cnt: 24, ((T([64, 128, 16, 32], f16), [64, 128, 512]), {})
cnt: 1, ((T([8192, 50265], f16), [64, 128, 50265]), {})
cnt: 8, ((T([64, 16, 128, 32], f16), [1024, 128, 32]), {})
cnt: 8, ((T([64, 128, 512], f16), [8192, 512]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([128], i64), 1), {})
cnt: 1, ((T([64, 128, 512], f16), T([128, 512], f16)), {})
cnt: 8, ((T([64, 16, 128, 128], f16), T([64, 1, 128, 128], f16)), {})
cnt: 48, ((T([64, 128, 512], f16), T([64, 128, 512], f16)), {})
cnt: 1, ((T([50265, 512], f16), T([50265, 512], f16)), {})
Operator: aten.addmm.default
cnt: 32, ((T([512], f16), T([8192, 512], f16), T([512, 512], f16, stride=(1, 512))), {})
cnt: 8, ((T([2048], f16), T([8192, 512], f16), T([512, 2048], f16, stride=(1, 512))), {})
cnt: 8, ((T([512], f16), T([8192, 2048], f16), T([2048, 512], f16, stride=(1, 2048))), {})
Operator: aten.bmm.default
cnt: 16, ((T([1024, 128, 32], f16), T([1024, 32, 128], f16, stride=(4096, 1, 32))), {})
cnt: 16, ((T([1024, 128, 128], f16), T([1024, 128, 32], f16)), {})
cnt: 8, ((T([1024, 128, 128], f16, stride=(16384, 1, 128)), T([1024, 128, 32], f16)), {})
cnt: 8, ((T([1024, 32, 128], f16, stride=(4096, 1, 32)), T([1024, 128, 128], f16)), {})
Operator: aten.clone.default
cnt: 2, ((T([64, 128], i64),), {})
Operator: aten.copy_.default
cnt: 2, ((T([64, 128], i64), T([64, 128], i64)), {})
Operator: aten.embedding.default
cnt: 1, ((T([50265, 512], f16), T([64, 128], i64), 0), {})
cnt: 1, ((T([512, 512], f16), T([128], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([128, 512], f16), T([128], i64), 512, -1, False), {})
cnt: 1, ((T([64, 128, 512], f16), T([64, 128], i64), 50265, 0, False), {})
Operator: aten.gelu.default
cnt: 8, ((T([64, 128, 2048], f16),), {})
Operator: aten.gelu_backward.default
cnt: 8, ((T([64, 128, 2048], f16), T([64, 128, 2048], f16)), {})
Operator: aten.lt.Tensor
cnt: 1, ((T([128], i64), T([128, 1], i64)), {})
Operator: aten.masked_fill_.Scalar
cnt: 1, ((T([128, 128], f32), T([128, 128], b8), 0), {})
Operator: aten.mm.default
cnt: 1, ((T([8192, 512], f16), T([512, 50265], f16, stride=(1, 512))), {})
cnt: 1, ((T([50265, 8192], f16, stride=(1, 50265)), T([8192, 512], f16)), {})
cnt: 1, ((T([8192, 50265], f16), T([50265, 512], f16)), {})
cnt: 8, ((T([8192, 512], f16), T([512, 2048], f16)), {})
cnt: 8, ((T([512, 8192], f16, stride=(1, 512)), T([8192, 2048], f16)), {})
cnt: 8, ((T([8192, 2048], f16), T([2048, 512], f16)), {})
cnt: 8, ((T([2048, 8192], f16, stride=(1, 2048)), T([8192, 512], f16)), {})
cnt: 32, ((T([8192, 512], f16), T([512, 512], f16)), {})
cnt: 32, ((T([512, 8192], f16, stride=(1, 512)), T([8192, 512], f16)), {})
Operator: aten.mul.Tensor
cnt: 2, ((T([64, 128, 512], f16), 1.0), {})
cnt: 16, ((T([64, 128, 512], f16), 0.1767766952966369), {})
Operator: aten.native_layer_norm.default
cnt: 17, ((T([64, 128, 512], f16), [512], T([512], f16), T([512], f16), 1e-05), {})
Operator: aten.native_layer_norm_backward.default
cnt: 17, ((T([64, 128, 512], f16), T([64, 128, 512], f16), [512], T([64, 128, 1], f32), T([64, 128, 1], f32), T([512], f16), T([512], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([8192, 50265], f16), T([8192], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([8192, 50265], f16), T([8192], i64), None, 1, -100), {})
Operator: aten.sum.SymInt
cnt: 40, ((T([8192, 512], f16), [0], True), {})
cnt: 8, ((T([8192, 2048], f16), [0], True), {})
cnt: 1, ((T([64, 128, 512], f16), [0], True), {})

View File

@ -0,0 +1,81 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([8192, 50265], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([8192, 50265], f16), T([8192, 50265], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 24, ((T([1024, 128, 128], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 24, ((T([1024, 128, 128], f16), T([1024, 128, 128], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([128, 128], f32),), {'dtype': f16})
cnt: 1, ((T([64, 1, 128, 128], f16, stride=(0, 16384, 128, 1)),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten._unsafe_view.default
cnt: 72, ((T([64, 128, 16, 32], f16), [64, 128, 512]), {})
cnt: 1, ((T([8192, 50265], f16), [64, 128, 50265]), {})
cnt: 24, ((T([64, 16, 128, 32], f16), [1024, 128, 32]), {})
cnt: 24, ((T([64, 128, 512], f16), [8192, 512]), {})
Operator: aten.add.Tensor
cnt: 2, ((T([64, 128, 512], f16), T([128, 512], f16)), {})
cnt: 127, ((T([64, 128, 512], f16), T([64, 128, 512], f16)), {})
cnt: 1, ((T([128], i64), 1), {})
cnt: 8, ((T([64, 16, 128, 128], f16), T([64, 1, 128, 128], f16)), {})
cnt: 1, ((T([64, 128, 50265], f16), T([1, 50265], f16)), {})
cnt: 2, ((T([50265, 512], f16), T([50265, 512], f16)), {})
Operator: aten.addmm.default
cnt: 96, ((T([512], f16), T([8192, 512], f16), T([512, 512], f16, stride=(1, 512))), {})
cnt: 16, ((T([2048], f16), T([8192, 512], f16), T([512, 2048], f16, stride=(1, 512))), {})
cnt: 16, ((T([512], f16), T([8192, 2048], f16), T([2048, 512], f16, stride=(1, 2048))), {})
Operator: aten.any.default
cnt: 16, ((T([64, 128, 512], b8),), {})
Operator: aten.bmm.default
cnt: 48, ((T([1024, 128, 32], f16), T([1024, 32, 128], f16, stride=(4096, 1, 32))), {})
cnt: 48, ((T([1024, 128, 128], f16), T([1024, 128, 32], f16)), {})
cnt: 24, ((T([1024, 128, 128], f16, stride=(16384, 1, 128)), T([1024, 128, 32], f16)), {})
cnt: 24, ((T([1024, 32, 128], f16, stride=(4096, 1, 32)), T([1024, 128, 128], f16)), {})
Operator: aten.clone.default
cnt: 3, ((T([64, 128], i64),), {})
Operator: aten.copy_.default
cnt: 3, ((T([64, 128], i64), T([64, 128], i64)), {})
Operator: aten.embedding.default
cnt: 2, ((T([50265, 512], f16), T([64, 128], i64), 0), {})
cnt: 2, ((T([512, 512], f16), T([128], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 2, ((T([128, 512], f16), T([128], i64), 512, -1, False), {})
cnt: 2, ((T([64, 128, 512], f16), T([64, 128], i64), 50265, 0, False), {})
Operator: aten.gelu.default
cnt: 16, ((T([64, 128, 2048], f16),), {})
Operator: aten.gelu_backward.default
cnt: 16, ((T([64, 128, 2048], f16), T([64, 128, 2048], f16)), {})
Operator: aten.isinf.default
cnt: 8, ((T([64, 128, 512], f16),), {})
Operator: aten.isnan.default
cnt: 8, ((T([64, 128, 512], f16),), {})
Operator: aten.lt.Tensor
cnt: 1, ((T([128], i64), T([128, 1], i64)), {})
Operator: aten.masked_fill_.Scalar
cnt: 1, ((T([128, 128], f32), T([128, 128], b8), 0), {})
Operator: aten.mm.default
cnt: 1, ((T([8192, 512], f16), T([512, 50265], f16, stride=(1, 512))), {})
cnt: 1, ((T([50265, 8192], f16, stride=(1, 50265)), T([8192, 512], f16)), {})
cnt: 1, ((T([8192, 50265], f16), T([50265, 512], f16)), {})
cnt: 16, ((T([8192, 512], f16), T([512, 2048], f16)), {})
cnt: 16, ((T([512, 8192], f16, stride=(1, 512)), T([8192, 2048], f16)), {})
cnt: 16, ((T([8192, 2048], f16), T([2048, 512], f16)), {})
cnt: 16, ((T([2048, 8192], f16, stride=(1, 2048)), T([8192, 512], f16)), {})
cnt: 96, ((T([8192, 512], f16), T([512, 512], f16)), {})
cnt: 96, ((T([512, 8192], f16, stride=(1, 512)), T([8192, 512], f16)), {})
Operator: aten.mul.Tensor
cnt: 4, ((T([64, 128, 512], f16), 1.0), {})
cnt: 48, ((T([64, 128, 512], f16), 0.1767766952966369), {})
Operator: aten.native_layer_norm.default
cnt: 42, ((T([64, 128, 512], f16), [512], T([512], f16), T([512], f16), 1e-05), {})
Operator: aten.native_layer_norm_backward.default
cnt: 42, ((T([64, 128, 512], f16), T([64, 128, 512], f16), [512], T([64, 128, 1], f32), T([64, 128, 1], f32), T([512], f16), T([512], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([8192, 50265], f16), T([8192], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([8192, 50265], f16), T([8192], i64), None, 1, -100), {})
Operator: aten.sum.SymInt
cnt: 112, ((T([8192, 512], f16), [0], True), {})
cnt: 16, ((T([8192, 2048], f16), [0], True), {})
cnt: 2, ((T([64, 128, 512], f16), [0], True), {})

View File

@ -0,0 +1,88 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([512, 32005], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([512, 32005], f16), T([512, 32005], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 12, ((T([1, 12, 512, 512], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 12, ((T([1, 12, 512, 512], f16), T([1, 12, 512, 512], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([1, 1, 1, 512], f32),), {'dtype': f16})
cnt: 1, ((T([1, 512], b8),), {'dtype': i32})
cnt: 1, ((T([1, 512], i64),), {'dtype': i32, 'layout': torch.strided, 'device': 'cuda'})
cnt: 1, ((T([1, 512], i32),), {'dtype': i64})
Operator: aten._unsafe_view.default
cnt: 12, ((T([12, 512, 512], f16), [1, 12, 512, 512]), {})
cnt: 12, ((T([12, 512, 64], f16), [1, 12, 512, 64]), {})
cnt: 24, ((T([1, 512, 12, 64], f16), [1, 512, 768]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([1, 512], i32), 0), {})
cnt: 1, ((T([1, 512], i64), 1), {})
cnt: 73, ((T([1, 512, 768], f16), T([1, 512, 768], f16)), {})
cnt: 12, ((T([1, 12, 512, 512], f16), T([1, 1, 1, 512], f16)), {})
cnt: 1, ((T([32005, 768], f16), T([32005, 768], f16)), {})
Operator: aten.add_.Tensor
cnt: 1, ((T([1, 512, 768], f16), T([1, 512, 768], f16)), {})
Operator: aten.addmm.default
cnt: 49, ((T([768], f16), T([512, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 12, ((T([3072], f16), T([512, 768], f16), T([768, 3072], f16, stride=(1, 768))), {})
cnt: 12, ((T([768], f16), T([512, 3072], f16), T([3072, 768], f16, stride=(1, 3072))), {})
cnt: 1, ((T([32005], f16), T([512, 768], f16), T([768, 32005], f16, stride=(1, 768))), {})
Operator: aten.bmm.default
cnt: 24, ((T([12, 512, 64], f16, stride=(64, 768, 1)), T([12, 64, 512], f16, stride=(64, 1, 768))), {})
cnt: 24, ((T([12, 512, 512], f16), T([12, 512, 64], f16, stride=(64, 768, 1))), {})
cnt: 12, ((T([12, 512, 512], f16, stride=(262144, 1, 512)), T([12, 512, 64], f16, stride=(64, 768, 1))), {})
cnt: 12, ((T([12, 64, 512], f16, stride=(64, 1, 768)), T([12, 512, 512], f16)), {})
Operator: aten.clone.default
cnt: 2, ((T([1, 512], i64),), {})
Operator: aten.copy_.default
cnt: 2, ((T([1, 512], i64), T([1, 512], i64)), {})
Operator: aten.cumsum.default
cnt: 1, ((T([1, 512], i32), 1), {})
Operator: aten.div.Tensor
cnt: 24, ((T([1, 12, 512, 512], f16), 8.0), {})
Operator: aten.embedding.default
cnt: 1, ((T([32005, 768], f16), T([1, 512], i64), 1), {})
cnt: 1, ((T([1, 768], f16), T([1, 512], i64)), {})
cnt: 1, ((T([514, 768], f16), T([1, 512], i64), 1), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([1, 512, 768], f16), T([1, 512], i64), 514, 1, False), {})
cnt: 1, ((T([1, 512, 768], f16), T([1, 512], i64), 1, -1, False), {})
cnt: 1, ((T([1, 512, 768], f16), T([1, 512], i64), 32005, 1, False), {})
Operator: aten.gelu.default
cnt: 12, ((T([1, 512, 3072], f16),), {})
cnt: 1, ((T([1, 512, 768], f16),), {})
Operator: aten.gelu_backward.default
cnt: 1, ((T([1, 512, 768], f16), T([1, 512, 768], f16)), {})
cnt: 12, ((T([1, 512, 3072], f16), T([1, 512, 3072], f16)), {})
Operator: aten.mm.default
cnt: 1, ((T([512, 32005], f16), T([32005, 768], f16)), {})
cnt: 1, ((T([32005, 512], f16, stride=(1, 32005)), T([512, 768], f16)), {})
cnt: 37, ((T([512, 768], f16), T([768, 768], f16)), {})
cnt: 37, ((T([768, 512], f16, stride=(1, 768)), T([512, 768], f16)), {})
cnt: 12, ((T([512, 768], f16), T([768, 3072], f16)), {})
cnt: 12, ((T([768, 512], f16, stride=(1, 768)), T([512, 3072], f16)), {})
cnt: 12, ((T([512, 3072], f16), T([3072, 768], f16)), {})
cnt: 12, ((T([3072, 512], f16, stride=(1, 3072)), T([512, 768], f16)), {})
cnt: 12, ((T([512, 768], f16, stride=(1, 512)), T([768, 768], f16)), {})
cnt: 12, ((T([768, 512], f16), T([512, 768], f16)), {})
Operator: aten.mul.Tensor
cnt: 1, ((T([1, 1, 1, 512], f16), -65504.0), {})
cnt: 1, ((T([1, 512], i32), T([1, 512], i32)), {})
Operator: aten.native_layer_norm.default
cnt: 26, ((T([1, 512, 768], f16), [768], T([768], f16), T([768], f16), 1e-05), {})
Operator: aten.native_layer_norm_backward.default
cnt: 26, ((T([1, 512, 768], f16), T([1, 512, 768], f16), [768], T([1, 512, 1], f32), T([1, 512, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
Operator: aten.ne.Scalar
cnt: 1, ((T([1, 512], i64), 1), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([512, 32005], f16), T([512], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([512, 32005], f16), T([512], i64), None, 1, -100), {})
Operator: aten.rsub.Scalar
cnt: 1, ((T([1, 1, 1, 512], f16), 1.0), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([512, 32005], f16), [0], True), {})
cnt: 49, ((T([512, 768], f16), [0], True), {})
cnt: 12, ((T([512, 3072], f16), [0], True), {})
cnt: 12, ((T([512, 768], f16, stride=(1, 512)), [0], True), {})

View File

@ -0,0 +1,132 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([2048, 50265], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([2048, 50265], f16), T([2048, 50265], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 12, ((T([4, 12, 512, 512], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 12, ((T([4, 12, 512, 512], f16), T([4, 12, 512, 512], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 25, ((T([4, 512, 768], f16),), {'dtype': f32})
cnt: 25, ((T([4, 512, 768], f32),), {'dtype': f16})
cnt: 1, ((T([4, 512, 1], f32),), {'dtype': f16})
cnt: 1, ((T([4, 1, 512, 512], f32),), {'dtype': torch.uint8})
cnt: 12, ((T([], f32),), {'dtype': f16, 'device': "torch.device('cpu')"})
cnt: 12, ((T([4, 1, 512, 512], u8),), {'dtype': torch.bool})
cnt: 25, ((T([4, 512, 768], f16),), {'dtype': f32, 'layout': torch.strided, 'device': 'cuda'})
cnt: 25, ((T([4, 512, 768], f32),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten._unsafe_view.default
cnt: 12, ((T([2048, 2304], f16), [4, 512, 2304]), {})
cnt: 36, ((T([4, 12, 512, 64], f16), [48, 512, 64]), {})
cnt: 12, ((T([4, 12, 64, 512], f16), [48, 64, 512]), {})
cnt: 12, ((T([48, 512, 512], f16), [4, 12, 512, 512]), {})
cnt: 12, ((T([48, 512, 64], f16), [4, 12, 512, 64]), {})
cnt: 12, ((T([4, 512, 12, 192], f16), [4, 512, 2304]), {})
Operator: aten.add.Tensor
cnt: 25, ((T([4, 512, 1], f32), 1e-07), {})
cnt: 25, ((T([4, 512, 768], f16), T([768], f16)), {})
cnt: 24, ((T([4, 12, 512, 64], f16, stride=(1179648, 192, 2304, 1)), T([1, 12, 1, 64], f16)), {})
cnt: 48, ((T([4, 512, 768], f16), T([4, 512, 768], f16)), {})
cnt: 50, ((T([4, 512, 768], f32), T([4, 512, 768], f32)), {})
cnt: 25, ((T([4, 512, 1], f32), T([4, 512, 1], f32)), {})
cnt: 1, ((T([50265, 768], f16), T([50265, 768], f16)), {})
Operator: aten.add_.Tensor
cnt: 1, ((T([4, 512, 768], f16), T([1, 512, 768], f16)), {})
Operator: aten.addmm.default
cnt: 13, ((T([768], f16), T([2048, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 12, ((T([3072], f16), T([2048, 768], f16), T([768, 3072], f16, stride=(1, 768))), {})
cnt: 12, ((T([768], f16), T([2048, 3072], f16), T([3072, 768], f16, stride=(1, 3072))), {})
cnt: 1, ((T([50265], f16), T([2048, 768], f16), T([768, 50265], f16, stride=(1, 768))), {})
Operator: aten.bitwise_not.default
cnt: 12, ((T([4, 1, 512, 512], b8),), {})
Operator: aten.bmm.default
cnt: 12, ((T([48, 512, 64], f16), T([48, 64, 512], f16)), {})
cnt: 12, ((T([48, 512, 512], f16), T([48, 512, 64], f16)), {})
cnt: 12, ((T([48, 512, 512], f16, stride=(262144, 1, 512)), T([48, 512, 64], f16)), {})
cnt: 12, ((T([48, 512, 64], f16), T([48, 64, 512], f16, stride=(32768, 1, 64))), {})
cnt: 12, ((T([48, 64, 512], f16, stride=(32768, 1, 64)), T([48, 512, 512], f16)), {})
cnt: 12, ((T([48, 512, 512], f16), T([48, 512, 64], f16, stride=(32768, 1, 512))), {})
Operator: aten.cat.default
cnt: 12, (([T([4, 12, 512, 64], f16), T([4, 12, 512, 64], f16, stride=(393216, 32768, 1, 512)), T([4, 12, 512, 64], f16)], 3), {})
Operator: aten.clone.default
cnt: 2, ((T([4, 512], i64),), {})
Operator: aten.copy_.default
cnt: 2, ((T([4, 512], i64), T([4, 512], i64)), {})
Operator: aten.div.Scalar
cnt: 50, ((T([4, 512, 768], f32, stride=(512, 1, 0)), 768), {})
Operator: aten.div.Tensor
cnt: 100, ((T([4, 512, 768], f32), T([4, 512, 1], f32)), {})
cnt: 12, ((T([4, 12, 512, 64], f16, stride=(393216, 64, 768, 1)), T([], f16)), {})
cnt: 25, ((T([4, 512, 1], f32), T([4, 512, 1], f32)), {})
cnt: 12, ((T([4, 12, 512, 64], f16), T([], f16)), {})
Operator: aten.embedding.default
cnt: 1, ((T([50265, 768], f16), T([4, 512], i64), 0), {})
cnt: 1, ((T([512, 768], f16), T([1, 512], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([1, 512, 768], f16), T([1, 512], i64), 512, -1, False), {})
cnt: 1, ((T([4, 512, 768], f16), T([4, 512], i64), 50265, 0, False), {})
Operator: aten.gelu.default
cnt: 12, ((T([4, 512, 3072], f16),), {})
cnt: 1, ((T([4, 512, 768], f16),), {})
Operator: aten.gelu_backward.default
cnt: 1, ((T([4, 512, 768], f16), T([4, 512, 768], f16)), {})
cnt: 12, ((T([4, 512, 3072], f16), T([4, 512, 3072], f16)), {})
Operator: aten.masked_fill.Tensor
cnt: 12, ((T([4, 12, 512, 512], f16), T([4, 1, 512, 512], b8), T([], f32)), {})
Operator: aten.masked_fill_.Scalar
cnt: 12, ((T([4, 12, 512, 512], f16), T([4, 1, 512, 512], b8), 0), {})
Operator: aten.mean.dim
cnt: 50, ((T([4, 512, 768], f32), [-1], True), {})
Operator: aten.mm.default
cnt: 12, ((T([2048, 768], f16), T([768, 2304], f16, stride=(1, 768))), {})
cnt: 1, ((T([2048, 50265], f16), T([50265, 768], f16)), {})
cnt: 1, ((T([50265, 2048], f16, stride=(1, 50265)), T([2048, 768], f16)), {})
cnt: 13, ((T([2048, 768], f16), T([768, 768], f16)), {})
cnt: 13, ((T([768, 2048], f16, stride=(1, 768)), T([2048, 768], f16)), {})
cnt: 12, ((T([2048, 768], f16), T([768, 3072], f16)), {})
cnt: 12, ((T([768, 2048], f16, stride=(1, 768)), T([2048, 3072], f16)), {})
cnt: 12, ((T([2048, 3072], f16), T([3072, 768], f16)), {})
cnt: 12, ((T([3072, 2048], f16, stride=(1, 3072)), T([2048, 768], f16)), {})
cnt: 12, ((T([2304, 2048], f16, stride=(1, 2304)), T([2048, 768], f16)), {})
cnt: 12, ((T([2048, 2304], f16), T([2304, 768], f16)), {})
Operator: aten.mul.Scalar
cnt: 25, ((T([4, 512, 1], f32), 2), {})
cnt: 25, ((T([4, 512, 768], f32), 2.0), {})
Operator: aten.mul.Tensor
cnt: 25, ((T([768], f16), T([4, 512, 768], f16)), {})
cnt: 2, ((T([4, 512, 768], f16), T([4, 512, 1], f16)), {})
cnt: 1, ((T([4, 1, 1, 512], f32), T([4, 1, 512, 1], f32)), {})
cnt: 12, ((T([], f32), 1), {})
cnt: 25, ((T([4, 512, 768], f16), T([768], f16)), {})
cnt: 25, ((T([4, 512, 768], f16), T([4, 512, 768], f16)), {})
cnt: 50, ((T([4, 512, 768], f32), T([4, 512, 768], f32)), {})
Operator: aten.native_layer_norm.default
cnt: 1, ((T([4, 512, 768], f16), [768], T([768], f16), T([768], f16), 1e-07), {})
Operator: aten.native_layer_norm_backward.default
cnt: 1, ((T([4, 512, 768], f16), T([4, 512, 768], f16), [768], T([4, 512, 1], f32), T([4, 512, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
Operator: aten.neg.default
cnt: 75, ((T([4, 512, 768], f32),), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([2048, 50265], f16), T([2048], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([2048, 50265], f16), T([2048], i64), None, 1, -100), {})
Operator: aten.pow.Tensor_Scalar
cnt: 25, ((T([4, 512, 768], f32), 2), {})
cnt: 25, ((T([4, 512, 768], f32), 1.0), {})
Operator: aten.slice_backward.default
cnt: 24, ((T([1, 1, 768], f16), [1, 1, 768], 2, 0, 9223372036854775807, 1), {})
Operator: aten.split.Tensor
cnt: 12, ((T([4, 12, 512, 192], f16, stride=(1179648, 192, 2304, 1)), 64, -1), {})
Operator: aten.sqrt.default
cnt: 25, ((T([4, 512, 1], f32),), {})
cnt: 12, ((T([], f32),), {})
Operator: aten.sub.Tensor
cnt: 50, ((T([4, 512, 768], f32), T([4, 512, 1], f32)), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([2048, 50265], f16), [0], True), {})
cnt: 25, ((T([2048, 768], f16), [0], True), {})
cnt: 50, ((T([4, 512, 768], f16), [0, 1], True), {})
cnt: 75, ((T([4, 512, 768], f32), [2], True), {})
cnt: 12, ((T([2048, 3072], f16), [0], True), {})
cnt: 24, ((T([4, 12, 512, 64], f16), [0, 2], True), {})
cnt: 1, ((T([4, 512, 768], f16), [0], True), {})

View File

@ -0,0 +1,133 @@
Operator: aten._log_softmax.default
cnt: 2, ((T([4, 512], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 2, ((T([4, 512], f16), T([4, 512], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 12, ((T([4, 12, 512, 512], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 12, ((T([4, 12, 512, 512], f16), T([4, 12, 512, 512], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 25, ((T([4, 512, 768], f16),), {'dtype': f32})
cnt: 25, ((T([4, 512, 768], f32),), {'dtype': f16})
cnt: 1, ((T([4, 512, 1], f32),), {'dtype': f16})
cnt: 1, ((T([4, 1, 512, 512], f32),), {'dtype': torch.uint8})
cnt: 12, ((T([], f32),), {'dtype': f16, 'device': "torch.device('cpu')"})
cnt: 12, ((T([4, 1, 512, 512], u8),), {'dtype': torch.bool})
cnt: 25, ((T([4, 512, 768], f16),), {'dtype': f32, 'layout': torch.strided, 'device': 'cuda'})
cnt: 25, ((T([4, 512, 768], f32),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten._unsafe_view.default
cnt: 12, ((T([2048, 2304], f16), [4, 512, 2304]), {})
cnt: 36, ((T([4, 12, 512, 64], f16), [48, 512, 64]), {})
cnt: 12, ((T([4, 12, 64, 512], f16), [48, 64, 512]), {})
cnt: 12, ((T([48, 512, 512], f16), [4, 12, 512, 512]), {})
cnt: 12, ((T([48, 512, 64], f16), [4, 12, 512, 64]), {})
cnt: 12, ((T([4, 512, 12, 192], f16), [4, 512, 2304]), {})
Operator: aten.add.Tensor
cnt: 25, ((T([4, 512, 1], f32), 1e-07), {})
cnt: 25, ((T([4, 512, 768], f16), T([768], f16)), {})
cnt: 24, ((T([4, 12, 512, 64], f16, stride=(1179648, 192, 2304, 1)), T([1, 12, 1, 64], f16)), {})
cnt: 48, ((T([4, 512, 768], f16), T([4, 512, 768], f16)), {})
cnt: 1, ((T([], f16), T([], f16)), {})
cnt: 50, ((T([4, 512, 768], f32), T([4, 512, 768], f32)), {})
cnt: 25, ((T([4, 512, 1], f32), T([4, 512, 1], f32)), {})
Operator: aten.add_.Tensor
cnt: 1, ((T([4, 512, 768], f16), T([1, 512, 768], f16)), {})
Operator: aten.addmm.default
cnt: 12, ((T([768], f16), T([2048, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 12, ((T([3072], f16), T([2048, 768], f16), T([768, 3072], f16, stride=(1, 768))), {})
cnt: 12, ((T([768], f16), T([2048, 3072], f16), T([3072, 768], f16, stride=(1, 3072))), {})
cnt: 1, ((T([2], f16), T([2048, 768], f16), T([768, 2], f16, stride=(1, 768))), {})
Operator: aten.bitwise_not.default
cnt: 12, ((T([4, 1, 512, 512], b8),), {})
Operator: aten.bmm.default
cnt: 12, ((T([48, 512, 64], f16), T([48, 64, 512], f16)), {})
cnt: 12, ((T([48, 512, 512], f16), T([48, 512, 64], f16)), {})
cnt: 12, ((T([48, 512, 512], f16, stride=(262144, 1, 512)), T([48, 512, 64], f16)), {})
cnt: 12, ((T([48, 512, 64], f16), T([48, 64, 512], f16, stride=(32768, 1, 64))), {})
cnt: 12, ((T([48, 64, 512], f16, stride=(32768, 1, 64)), T([48, 512, 512], f16)), {})
cnt: 12, ((T([48, 512, 512], f16), T([48, 512, 64], f16, stride=(32768, 1, 512))), {})
Operator: aten.cat.default
cnt: 1, (([T([4, 512, 1], f16), T([4, 512, 1], f16)], 2), {})
cnt: 12, (([T([4, 12, 512, 64], f16), T([4, 12, 512, 64], f16, stride=(393216, 32768, 1, 512)), T([4, 12, 512, 64], f16)], 3), {})
Operator: aten.clamp.default
cnt: 2, ((T([4], i64), 0, 512), {})
Operator: aten.clone.default
cnt: 1, ((T([4, 512], i64),), {})
cnt: 2, ((T([4], i64),), {})
Operator: aten.copy_.default
cnt: 1, ((T([4, 512], i64), T([4, 512], i64)), {})
cnt: 2, ((T([4], i64), T([4], i64)), {})
Operator: aten.div.Scalar
cnt: 50, ((T([4, 512, 768], f32, stride=(512, 1, 0)), 768), {})
Operator: aten.div.Tensor
cnt: 100, ((T([4, 512, 768], f32), T([4, 512, 1], f32)), {})
cnt: 12, ((T([4, 12, 512, 64], f16, stride=(393216, 64, 768, 1)), T([], f16)), {})
cnt: 2, ((T([], f16), 2), {})
cnt: 25, ((T([4, 512, 1], f32), T([4, 512, 1], f32)), {})
cnt: 12, ((T([4, 12, 512, 64], f16), T([], f16)), {})
Operator: aten.embedding.default
cnt: 1, ((T([50265, 768], f16), T([4, 512], i64), 0), {})
cnt: 1, ((T([512, 768], f16), T([1, 512], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([1, 512, 768], f16), T([1, 512], i64), 512, -1, False), {})
cnt: 1, ((T([4, 512, 768], f16), T([4, 512], i64), 50265, 0, False), {})
Operator: aten.gelu.default
cnt: 12, ((T([4, 512, 3072], f16),), {})
Operator: aten.gelu_backward.default
cnt: 12, ((T([4, 512, 3072], f16), T([4, 512, 3072], f16)), {})
Operator: aten.masked_fill.Tensor
cnt: 12, ((T([4, 12, 512, 512], f16), T([4, 1, 512, 512], b8), T([], f32)), {})
Operator: aten.masked_fill_.Scalar
cnt: 12, ((T([4, 12, 512, 512], f16), T([4, 1, 512, 512], b8), 0), {})
Operator: aten.mean.dim
cnt: 50, ((T([4, 512, 768], f32), [-1], True), {})
Operator: aten.mm.default
cnt: 12, ((T([2048, 768], f16), T([768, 2304], f16, stride=(1, 768))), {})
cnt: 1, ((T([2048, 2], f16), T([2, 768], f16)), {})
cnt: 1, ((T([2, 2048], f16, stride=(1, 2)), T([2048, 768], f16)), {})
cnt: 12, ((T([2048, 768], f16), T([768, 3072], f16)), {})
cnt: 12, ((T([768, 2048], f16, stride=(1, 768)), T([2048, 3072], f16)), {})
cnt: 12, ((T([2048, 3072], f16), T([3072, 768], f16)), {})
cnt: 12, ((T([3072, 2048], f16, stride=(1, 3072)), T([2048, 768], f16)), {})
cnt: 12, ((T([2048, 768], f16), T([768, 768], f16)), {})
cnt: 12, ((T([768, 2048], f16, stride=(1, 768)), T([2048, 768], f16)), {})
cnt: 12, ((T([2304, 2048], f16, stride=(1, 2304)), T([2048, 768], f16)), {})
cnt: 12, ((T([2048, 2304], f16), T([2304, 768], f16)), {})
Operator: aten.mul.Scalar
cnt: 25, ((T([4, 512, 1], f32), 2), {})
cnt: 25, ((T([4, 512, 768], f32), 2.0), {})
Operator: aten.mul.Tensor
cnt: 25, ((T([768], f16), T([4, 512, 768], f16)), {})
cnt: 2, ((T([4, 512, 768], f16), T([4, 512, 1], f16)), {})
cnt: 1, ((T([4, 1, 1, 512], f32), T([4, 1, 512, 1], f32)), {})
cnt: 12, ((T([], f32), 1), {})
cnt: 25, ((T([4, 512, 768], f16), T([768], f16)), {})
cnt: 25, ((T([4, 512, 768], f16), T([4, 512, 768], f16)), {})
cnt: 50, ((T([4, 512, 768], f32), T([4, 512, 768], f32)), {})
Operator: aten.neg.default
cnt: 75, ((T([4, 512, 768], f32),), {})
Operator: aten.nll_loss_backward.default
cnt: 2, ((T([], f16), T([4, 512], f16), T([4], i64), None, 1, 512, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 2, ((T([4, 512], f16), T([4], i64), None, 1, 512), {})
Operator: aten.pow.Tensor_Scalar
cnt: 25, ((T([4, 512, 768], f32), 2), {})
cnt: 25, ((T([4, 512, 768], f32), 1.0), {})
Operator: aten.slice_backward.default
cnt: 24, ((T([1, 1, 768], f16), [1, 1, 768], 2, 0, 9223372036854775807, 1), {})
Operator: aten.split.Tensor
cnt: 12, ((T([4, 12, 512, 192], f16, stride=(1179648, 192, 2304, 1)), 64, -1), {})
cnt: 1, ((T([4, 512, 2], f16), 1, -1), {})
Operator: aten.sqrt.default
cnt: 25, ((T([4, 512, 1], f32),), {})
cnt: 12, ((T([], f32),), {})
Operator: aten.sub.Tensor
cnt: 50, ((T([4, 512, 768], f32), T([4, 512, 1], f32)), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([2048, 2], f16), [0], True), {})
cnt: 50, ((T([4, 512, 768], f16), [0, 1], True), {})
cnt: 75, ((T([4, 512, 768], f32), [2], True), {})
cnt: 24, ((T([2048, 768], f16), [0], True), {})
cnt: 12, ((T([2048, 3072], f16), [0], True), {})
cnt: 24, ((T([4, 12, 512, 64], f16), [0, 2], True), {})
cnt: 1, ((T([4, 512, 768], f16), [0], True), {})

View File

@ -0,0 +1,85 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([512, 128100], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([512, 128100], f16), T([512, 128100], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 24, ((T([1, 24, 512, 512], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 24, ((T([1, 24, 512, 512], f16), T([1, 24, 512, 512], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([1, 512, 1], f32),), {'dtype': f16})
cnt: 1, ((T([1, 1, 512, 512], f32),), {'dtype': torch.uint8})
cnt: 24, ((T([], f32),), {'dtype': f16, 'device': "torch.device('cpu')"})
cnt: 24, ((T([1, 1, 512, 512], u8),), {'dtype': torch.bool})
Operator: aten._unsafe_view.default
cnt: 48, ((T([1, 512, 24, 64], f16), [1, 512, 1536]), {})
Operator: aten.add.Tensor
cnt: 144, ((T([1, 512, 1536], f16), T([1, 512, 1536], f16)), {})
cnt: 1, ((T([128100, 1536], f16), T([128100, 1536], f16)), {})
Operator: aten.add_.Tensor
cnt: 1, ((T([1, 512, 1536], f16), T([1, 512, 1536], f16)), {})
Operator: aten.addmm.default
cnt: 97, ((T([1536], f16), T([512, 1536], f16), T([1536, 1536], f16, stride=(1, 1536))), {})
cnt: 24, ((T([6144], f16), T([512, 1536], f16), T([1536, 6144], f16, stride=(1, 1536))), {})
cnt: 24, ((T([1536], f16), T([512, 6144], f16), T([6144, 1536], f16, stride=(1, 6144))), {})
cnt: 1, ((T([128100], f16), T([512, 1536], f16), T([1536, 128100], f16, stride=(1, 1536))), {})
Operator: aten.bitwise_not.default
cnt: 24, ((T([1, 1, 512, 512], b8),), {})
Operator: aten.bmm.default
cnt: 24, ((T([24, 512, 64], f16), T([24, 64, 512], f16, stride=(32768, 1, 64))), {})
cnt: 48, ((T([24, 512, 512], f16), T([24, 512, 64], f16)), {})
cnt: 24, ((T([24, 512, 512], f16, stride=(262144, 1, 512)), T([24, 512, 64], f16, stride=(64, 1536, 1))), {})
cnt: 24, ((T([24, 512, 64], f16, stride=(64, 1536, 1)), T([24, 64, 512], f16, stride=(32768, 1, 64))), {})
cnt: 24, ((T([24, 64, 512], f16, stride=(32768, 1, 64)), T([24, 512, 512], f16)), {})
Operator: aten.clone.default
cnt: 2, ((T([1, 512], i64),), {})
Operator: aten.copy_.default
cnt: 2, ((T([1, 512], i64), T([1, 512], i64)), {})
Operator: aten.div.Tensor
cnt: 48, ((T([24, 512, 512], f16), T([], f16)), {})
Operator: aten.embedding.default
cnt: 1, ((T([128100, 1536], f16), T([1, 512], i64), 0), {})
cnt: 1, ((T([512, 1536], f16), T([1, 512], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([1, 512, 1536], f16), T([1, 512], i64), 512, -1, False), {})
cnt: 1, ((T([1, 512, 1536], f16), T([1, 512], i64), 128100, 0, False), {})
Operator: aten.gelu.default
cnt: 24, ((T([1, 512, 6144], f16),), {})
cnt: 1, ((T([1, 512, 1536], f16),), {})
Operator: aten.gelu_backward.default
cnt: 1, ((T([1, 512, 1536], f16), T([1, 512, 1536], f16)), {})
cnt: 24, ((T([1, 512, 6144], f16), T([1, 512, 6144], f16)), {})
Operator: aten.masked_fill.Tensor
cnt: 24, ((T([1, 24, 512, 512], f16), T([1, 1, 512, 512], b8), T([], f32)), {})
Operator: aten.masked_fill_.Scalar
cnt: 24, ((T([1, 24, 512, 512], f16), T([1, 1, 512, 512], b8), 0), {})
Operator: aten.mm.default
cnt: 1, ((T([512, 128100], f16), T([128100, 1536], f16)), {})
cnt: 1, ((T([128100, 512], f16, stride=(1, 128100)), T([512, 1536], f16)), {})
cnt: 73, ((T([512, 1536], f16), T([1536, 1536], f16)), {})
cnt: 73, ((T([1536, 512], f16, stride=(1, 1536)), T([512, 1536], f16)), {})
cnt: 24, ((T([512, 1536], f16), T([1536, 6144], f16)), {})
cnt: 24, ((T([1536, 512], f16, stride=(1, 1536)), T([512, 6144], f16)), {})
cnt: 24, ((T([512, 6144], f16), T([6144, 1536], f16)), {})
cnt: 24, ((T([6144, 512], f16, stride=(1, 6144)), T([512, 1536], f16)), {})
cnt: 24, ((T([512, 1536], f16, stride=(1, 512)), T([1536, 1536], f16)), {})
cnt: 24, ((T([1536, 512], f16), T([512, 1536], f16)), {})
Operator: aten.mul.Tensor
cnt: 2, ((T([1, 512, 1536], f16), T([1, 512, 1], f16)), {})
cnt: 1, ((T([1, 1, 1, 512], f32), T([1, 1, 512, 1], f32)), {})
cnt: 24, ((T([], f32), 1), {})
Operator: aten.native_layer_norm.default
cnt: 50, ((T([1, 512, 1536], f16), [1536], T([1536], f16), T([1536], f16), 1e-07), {})
Operator: aten.native_layer_norm_backward.default
cnt: 50, ((T([1, 512, 1536], f16), T([1, 512, 1536], f16), [1536], T([1, 512, 1], f32), T([1, 512, 1], f32), T([1536], f16), T([1536], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([512, 128100], f16), T([512], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([512, 128100], f16), T([512], i64), None, 1, -100), {})
Operator: aten.sqrt.default
cnt: 24, ((T([], f32),), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([512, 128100], f16), [0], True), {})
cnt: 97, ((T([512, 1536], f16), [0], True), {})
cnt: 24, ((T([512, 6144], f16), [0], True), {})
cnt: 24, ((T([512, 1536], f16, stride=(1, 512)), [0], True), {})

View File

@ -0,0 +1,92 @@
Operator: aten._log_softmax.default
cnt: 2, ((T([1, 512], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 2, ((T([1, 512], f16), T([1, 512], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 24, ((T([1, 24, 512, 512], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 24, ((T([1, 24, 512, 512], f16), T([1, 24, 512, 512], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([1, 512, 1], f32),), {'dtype': f16})
cnt: 1, ((T([1, 1, 512, 512], f32),), {'dtype': torch.uint8})
cnt: 24, ((T([], f32),), {'dtype': f16, 'device': "torch.device('cpu')"})
cnt: 24, ((T([1, 1, 512, 512], u8),), {'dtype': torch.bool})
Operator: aten._unsafe_view.default
cnt: 48, ((T([1, 512, 24, 64], f16), [1, 512, 1536]), {})
Operator: aten.add.Tensor
cnt: 144, ((T([1, 512, 1536], f16), T([1, 512, 1536], f16)), {})
cnt: 1, ((T([], f16), T([], f16)), {})
Operator: aten.add_.Tensor
cnt: 1, ((T([1, 512, 1536], f16), T([1, 512, 1536], f16)), {})
Operator: aten.addmm.default
cnt: 96, ((T([1536], f16), T([512, 1536], f16), T([1536, 1536], f16, stride=(1, 1536))), {})
cnt: 24, ((T([6144], f16), T([512, 1536], f16), T([1536, 6144], f16, stride=(1, 1536))), {})
cnt: 24, ((T([1536], f16), T([512, 6144], f16), T([6144, 1536], f16, stride=(1, 6144))), {})
cnt: 1, ((T([2], f16), T([512, 1536], f16), T([1536, 2], f16, stride=(1, 1536))), {})
Operator: aten.bitwise_not.default
cnt: 24, ((T([1, 1, 512, 512], b8),), {})
Operator: aten.bmm.default
cnt: 24, ((T([24, 512, 64], f16), T([24, 64, 512], f16, stride=(32768, 1, 64))), {})
cnt: 48, ((T([24, 512, 512], f16), T([24, 512, 64], f16)), {})
cnt: 24, ((T([24, 512, 512], f16, stride=(262144, 1, 512)), T([24, 512, 64], f16, stride=(64, 1536, 1))), {})
cnt: 24, ((T([24, 512, 64], f16, stride=(64, 1536, 1)), T([24, 64, 512], f16, stride=(32768, 1, 64))), {})
cnt: 24, ((T([24, 64, 512], f16, stride=(32768, 1, 64)), T([24, 512, 512], f16)), {})
Operator: aten.cat.default
cnt: 1, (([T([1, 512, 1], f16), T([1, 512, 1], f16)], 2), {})
Operator: aten.clamp.default
cnt: 2, ((T([1], i64), 0, 512), {})
Operator: aten.clone.default
cnt: 1, ((T([1, 512], i64),), {})
cnt: 2, ((T([1], i64),), {})
Operator: aten.copy_.default
cnt: 1, ((T([1, 512], i64), T([1, 512], i64)), {})
cnt: 2, ((T([1], i64), T([1], i64)), {})
Operator: aten.div.Tensor
cnt: 48, ((T([24, 512, 512], f16), T([], f16)), {})
cnt: 2, ((T([], f16), 2), {})
Operator: aten.embedding.default
cnt: 1, ((T([128100, 1536], f16), T([1, 512], i64), 0), {})
cnt: 1, ((T([512, 1536], f16), T([1, 512], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([1, 512, 1536], f16), T([1, 512], i64), 512, -1, False), {})
cnt: 1, ((T([1, 512, 1536], f16), T([1, 512], i64), 128100, 0, False), {})
Operator: aten.gelu.default
cnt: 24, ((T([1, 512, 6144], f16),), {})
Operator: aten.gelu_backward.default
cnt: 24, ((T([1, 512, 6144], f16), T([1, 512, 6144], f16)), {})
Operator: aten.masked_fill.Tensor
cnt: 24, ((T([1, 24, 512, 512], f16), T([1, 1, 512, 512], b8), T([], f32)), {})
Operator: aten.masked_fill_.Scalar
cnt: 24, ((T([1, 24, 512, 512], f16), T([1, 1, 512, 512], b8), 0), {})
Operator: aten.mm.default
cnt: 1, ((T([512, 2], f16), T([2, 1536], f16)), {})
cnt: 1, ((T([2, 512], f16, stride=(1, 2)), T([512, 1536], f16)), {})
cnt: 24, ((T([512, 1536], f16), T([1536, 6144], f16)), {})
cnt: 24, ((T([1536, 512], f16, stride=(1, 1536)), T([512, 6144], f16)), {})
cnt: 24, ((T([512, 6144], f16), T([6144, 1536], f16)), {})
cnt: 24, ((T([6144, 512], f16, stride=(1, 6144)), T([512, 1536], f16)), {})
cnt: 72, ((T([512, 1536], f16), T([1536, 1536], f16)), {})
cnt: 72, ((T([1536, 512], f16, stride=(1, 1536)), T([512, 1536], f16)), {})
cnt: 24, ((T([512, 1536], f16, stride=(1, 512)), T([1536, 1536], f16)), {})
cnt: 24, ((T([1536, 512], f16), T([512, 1536], f16)), {})
Operator: aten.mul.Tensor
cnt: 2, ((T([1, 512, 1536], f16), T([1, 512, 1], f16)), {})
cnt: 1, ((T([1, 1, 1, 512], f32), T([1, 1, 512, 1], f32)), {})
cnt: 24, ((T([], f32), 1), {})
Operator: aten.native_layer_norm.default
cnt: 49, ((T([1, 512, 1536], f16), [1536], T([1536], f16), T([1536], f16), 1e-07), {})
Operator: aten.native_layer_norm_backward.default
cnt: 49, ((T([1, 512, 1536], f16), T([1, 512, 1536], f16), [1536], T([1, 512, 1], f32), T([1, 512, 1], f32), T([1536], f16), T([1536], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 2, ((T([], f16), T([1, 512], f16), T([1], i64), None, 1, 512, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 2, ((T([1, 512], f16), T([1], i64), None, 1, 512), {})
Operator: aten.split.Tensor
cnt: 1, ((T([1, 512, 2], f16), 1, -1), {})
Operator: aten.sqrt.default
cnt: 24, ((T([], f32),), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([512, 2], f16), [0], True), {})
cnt: 96, ((T([512, 1536], f16), [0], True), {})
cnt: 24, ((T([512, 6144], f16), [0], True), {})
cnt: 24, ((T([512, 1536], f16, stride=(1, 512)), [0], True), {})

View File

@ -0,0 +1,78 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([2048, 30522], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([2048, 30522], f16), T([2048, 30522], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 6, ((T([16, 12, 128, 128], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 6, ((T([16, 12, 128, 128], f16), T([16, 12, 128, 128], f16), -1, f16), {})
Operator: aten._unsafe_view.default
cnt: 18, ((T([16, 12, 128, 64], f16), [192, 128, 64]), {})
cnt: 6, ((T([16, 12, 64, 128], f16), [192, 64, 128]), {})
cnt: 6, ((T([192, 128, 128], f16), [16, 12, 128, 128]), {})
cnt: 6, ((T([192, 128, 64], f16), [16, 12, 128, 64]), {})
cnt: 12, ((T([16, 128, 12, 64], f16), [16, 128, 768]), {})
cnt: 6, ((T([16, 128, 768], f16), [2048, 768]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([16, 128, 768], f16), T([1, 128, 768], f16)), {})
cnt: 36, ((T([16, 128, 768], f16), T([16, 128, 768], f16)), {})
cnt: 1, ((T([30522, 768], f16), T([30522, 768], f16)), {})
Operator: aten.addmm.default
cnt: 25, ((T([768], f16), T([2048, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 6, ((T([3072], f16), T([2048, 768], f16), T([768, 3072], f16, stride=(1, 768))), {})
cnt: 6, ((T([768], f16), T([2048, 3072], f16), T([3072, 768], f16, stride=(1, 3072))), {})
cnt: 1, ((T([30522], f16), T([2048, 768], f16), T([768, 30522], f16, stride=(1, 768))), {})
Operator: aten.bmm.default
cnt: 6, ((T([192, 128, 64], f16), T([192, 64, 128], f16)), {})
cnt: 6, ((T([192, 128, 128], f16), T([192, 128, 64], f16)), {})
cnt: 6, ((T([192, 128, 128], f16, stride=(16384, 1, 128)), T([192, 128, 64], f16)), {})
cnt: 6, ((T([192, 128, 64], f16), T([192, 64, 128], f16, stride=(8192, 1, 64))), {})
cnt: 6, ((T([192, 64, 128], f16, stride=(8192, 1, 64)), T([192, 128, 128], f16)), {})
cnt: 6, ((T([192, 128, 128], f16), T([192, 128, 64], f16, stride=(8192, 1, 128))), {})
Operator: aten.clone.default
cnt: 2, ((T([16, 128], i64),), {})
Operator: aten.copy_.default
cnt: 2, ((T([16, 128], i64), T([16, 128], i64)), {})
Operator: aten.div.Tensor
cnt: 6, ((T([16, 12, 128, 64], f16, stride=(98304, 64, 768, 1)), 8.0), {})
cnt: 6, ((T([16, 12, 128, 64], f16), 8.0), {})
Operator: aten.embedding.default
cnt: 1, ((T([30522, 768], f16), T([16, 128], i64), 0), {})
cnt: 1, ((T([512, 768], f16), T([1, 128], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([1, 128, 768], f16), T([1, 128], i64), 512, -1, False), {})
cnt: 1, ((T([16, 128, 768], f16), T([16, 128], i64), 30522, 0, False), {})
Operator: aten.eq.Scalar
cnt: 6, ((T([16, 128], f32), 0), {})
Operator: aten.gelu.default
cnt: 6, ((T([16, 128, 3072], f16),), {})
cnt: 1, ((T([16, 128, 768], f16),), {})
Operator: aten.gelu_backward.default
cnt: 1, ((T([16, 128, 768], f16), T([16, 128, 768], f16)), {})
cnt: 6, ((T([16, 128, 3072], f16), T([16, 128, 3072], f16)), {})
Operator: aten.masked_fill.Scalar
cnt: 6, ((T([16, 12, 128, 128], f16), T([16, 12, 128, 128], b8, stride=(128, 0, 0, 1)), 0), {})
Operator: aten.masked_fill.Tensor
cnt: 6, ((T([16, 12, 128, 128], f16), T([16, 12, 128, 128], b8, stride=(128, 0, 0, 1)), T([], f32)), {})
Operator: aten.mm.default
cnt: 1, ((T([2048, 30522], f16), T([30522, 768], f16)), {})
cnt: 1, ((T([30522, 2048], f16, stride=(1, 30522)), T([2048, 768], f16)), {})
cnt: 25, ((T([2048, 768], f16), T([768, 768], f16)), {})
cnt: 25, ((T([768, 2048], f16, stride=(1, 768)), T([2048, 768], f16)), {})
cnt: 6, ((T([2048, 768], f16), T([768, 3072], f16)), {})
cnt: 6, ((T([768, 2048], f16, stride=(1, 768)), T([2048, 3072], f16)), {})
cnt: 6, ((T([2048, 3072], f16), T([3072, 768], f16)), {})
cnt: 6, ((T([3072, 2048], f16, stride=(1, 3072)), T([2048, 768], f16)), {})
Operator: aten.native_layer_norm.default
cnt: 14, ((T([16, 128, 768], f16), [768], T([768], f16), T([768], f16), 1e-12), {})
Operator: aten.native_layer_norm_backward.default
cnt: 14, ((T([16, 128, 768], f16), T([16, 128, 768], f16), [768], T([16, 128, 1], f32), T([16, 128, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([2048, 30522], f16), T([2048], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([2048, 30522], f16), T([2048], i64), None, 1, -100), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([2048, 30522], f16), [0], True), {})
cnt: 31, ((T([2048, 768], f16), [0], True), {})
cnt: 6, ((T([2048, 3072], f16), [0], True), {})
cnt: 1, ((T([16, 128, 768], f16), [0], True), {})

View File

@ -0,0 +1,85 @@
Operator: aten._log_softmax.default
cnt: 2, ((T([32, 128], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 2, ((T([32, 128], f16), T([32, 128], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 6, ((T([32, 12, 128, 128], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 6, ((T([32, 12, 128, 128], f16), T([32, 12, 128, 128], f16), -1, f16), {})
Operator: aten._unsafe_view.default
cnt: 18, ((T([32, 12, 128, 64], f16), [384, 128, 64]), {})
cnt: 6, ((T([32, 12, 64, 128], f16), [384, 64, 128]), {})
cnt: 6, ((T([384, 128, 128], f16), [32, 12, 128, 128]), {})
cnt: 6, ((T([384, 128, 64], f16), [32, 12, 128, 64]), {})
cnt: 12, ((T([32, 128, 12, 64], f16), [32, 128, 768]), {})
cnt: 6, ((T([32, 128, 768], f16), [4096, 768]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([32, 128, 768], f16), T([1, 128, 768], f16)), {})
cnt: 36, ((T([32, 128, 768], f16), T([32, 128, 768], f16)), {})
cnt: 1, ((T([], f16), T([], f16)), {})
Operator: aten.addmm.default
cnt: 24, ((T([768], f16), T([4096, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 6, ((T([3072], f16), T([4096, 768], f16), T([768, 3072], f16, stride=(1, 768))), {})
cnt: 6, ((T([768], f16), T([4096, 3072], f16), T([3072, 768], f16, stride=(1, 3072))), {})
cnt: 1, ((T([2], f16), T([4096, 768], f16), T([768, 2], f16, stride=(1, 768))), {})
Operator: aten.bmm.default
cnt: 6, ((T([384, 128, 64], f16), T([384, 64, 128], f16)), {})
cnt: 6, ((T([384, 128, 128], f16), T([384, 128, 64], f16)), {})
cnt: 6, ((T([384, 128, 128], f16, stride=(16384, 1, 128)), T([384, 128, 64], f16)), {})
cnt: 6, ((T([384, 128, 64], f16), T([384, 64, 128], f16, stride=(8192, 1, 64))), {})
cnt: 6, ((T([384, 64, 128], f16, stride=(8192, 1, 64)), T([384, 128, 128], f16)), {})
cnt: 6, ((T([384, 128, 128], f16), T([384, 128, 64], f16, stride=(8192, 1, 128))), {})
Operator: aten.cat.default
cnt: 1, (([T([32, 128, 1], f16), T([32, 128, 1], f16)], 2), {})
Operator: aten.clamp.default
cnt: 2, ((T([32], i64), 0, 128), {})
Operator: aten.clone.default
cnt: 1, ((T([32, 128], i64),), {})
cnt: 2, ((T([32], i64),), {})
Operator: aten.copy_.default
cnt: 1, ((T([32, 128], i64), T([32, 128], i64)), {})
cnt: 2, ((T([32], i64), T([32], i64)), {})
Operator: aten.div.Tensor
cnt: 6, ((T([32, 12, 128, 64], f16, stride=(98304, 64, 768, 1)), 8.0), {})
cnt: 2, ((T([], f16), 2), {})
cnt: 6, ((T([32, 12, 128, 64], f16), 8.0), {})
Operator: aten.embedding.default
cnt: 1, ((T([30522, 768], f16), T([32, 128], i64), 0), {})
cnt: 1, ((T([512, 768], f16), T([1, 128], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([1, 128, 768], f16), T([1, 128], i64), 512, -1, False), {})
cnt: 1, ((T([32, 128, 768], f16), T([32, 128], i64), 30522, 0, False), {})
Operator: aten.eq.Scalar
cnt: 6, ((T([32, 128], f32), 0), {})
Operator: aten.gelu.default
cnt: 6, ((T([32, 128, 3072], f16),), {})
Operator: aten.gelu_backward.default
cnt: 6, ((T([32, 128, 3072], f16), T([32, 128, 3072], f16)), {})
Operator: aten.masked_fill.Scalar
cnt: 6, ((T([32, 12, 128, 128], f16), T([32, 12, 128, 128], b8, stride=(128, 0, 0, 1)), 0), {})
Operator: aten.masked_fill.Tensor
cnt: 6, ((T([32, 12, 128, 128], f16), T([32, 12, 128, 128], b8, stride=(128, 0, 0, 1)), T([], f32)), {})
Operator: aten.mm.default
cnt: 1, ((T([4096, 2], f16), T([2, 768], f16)), {})
cnt: 1, ((T([2, 4096], f16, stride=(1, 2)), T([4096, 768], f16)), {})
cnt: 6, ((T([4096, 768], f16), T([768, 3072], f16)), {})
cnt: 6, ((T([768, 4096], f16, stride=(1, 768)), T([4096, 3072], f16)), {})
cnt: 6, ((T([4096, 3072], f16), T([3072, 768], f16)), {})
cnt: 6, ((T([3072, 4096], f16, stride=(1, 3072)), T([4096, 768], f16)), {})
cnt: 24, ((T([4096, 768], f16), T([768, 768], f16)), {})
cnt: 24, ((T([768, 4096], f16, stride=(1, 768)), T([4096, 768], f16)), {})
Operator: aten.native_layer_norm.default
cnt: 13, ((T([32, 128, 768], f16), [768], T([768], f16), T([768], f16), 1e-12), {})
Operator: aten.native_layer_norm_backward.default
cnt: 13, ((T([32, 128, 768], f16), T([32, 128, 768], f16), [768], T([32, 128, 1], f32), T([32, 128, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 2, ((T([], f16), T([32, 128], f16), T([32], i64), None, 1, 128, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 2, ((T([32, 128], f16), T([32], i64), None, 1, 128), {})
Operator: aten.split.Tensor
cnt: 1, ((T([32, 128, 2], f16), 1, -1), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([4096, 2], f16), [0], True), {})
cnt: 30, ((T([4096, 768], f16), [0], True), {})
cnt: 6, ((T([4096, 3072], f16), [0], True), {})
cnt: 1, ((T([32, 128, 768], f16), [0], True), {})

View File

@ -0,0 +1,91 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([511, 50257], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([511, 50257], f16), T([511, 50257], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 6, ((T([1, 12, 512, 512], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 6, ((T([1, 12, 512, 512], f16), T([1, 12, 512, 512], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 6, ((T([1, 1, 512, 512], u8, stride=(1048576, 1048576, 1024, 1)),), {'dtype': torch.bool})
cnt: 6, ((T([], f16),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten._unsafe_view.default
cnt: 6, ((T([12, 512, 512], f16), [1, 12, 512, 512]), {})
cnt: 6, ((T([12, 512, 64], f16), [1, 12, 512, 64]), {})
cnt: 1, ((T([512, 50257], f16), [1, 512, 50257]), {})
cnt: 12, ((T([1, 512, 12, 64], f16), [1, 512, 768]), {})
Operator: aten.add.Tensor
cnt: 25, ((T([1, 512, 768], f16), T([1, 512, 768], f16)), {})
cnt: 18, ((T([1, 512, 3072], f16), T([1, 512, 3072], f16)), {})
cnt: 6, ((T([1, 512, 3072], f16), 1.0), {})
cnt: 1, ((T([50257, 768], f16), T([50257, 768], f16)), {})
Operator: aten.addmm.default
cnt: 6, ((T([2304], f16), T([512, 768], f16), T([768, 2304], f16)), {})
cnt: 6, ((T([768], f16), T([512, 768], f16), T([768, 768], f16)), {})
cnt: 6, ((T([3072], f16), T([512, 768], f16), T([768, 3072], f16)), {})
cnt: 6, ((T([768], f16), T([512, 3072], f16), T([3072, 768], f16)), {})
Operator: aten.bmm.default
cnt: 6, ((T([12, 512, 64], f16, stride=(64, 2304, 1)), T([12, 64, 512], f16, stride=(64, 1, 2304))), {})
cnt: 12, ((T([12, 512, 512], f16), T([12, 512, 64], f16, stride=(64, 2304, 1))), {})
cnt: 6, ((T([12, 512, 512], f16, stride=(262144, 1, 512)), T([12, 512, 64], f16, stride=(64, 768, 1))), {})
cnt: 6, ((T([12, 512, 64], f16, stride=(64, 768, 1)), T([12, 64, 512], f16, stride=(64, 1, 2304))), {})
cnt: 6, ((T([12, 64, 512], f16, stride=(64, 1, 2304)), T([12, 512, 512], f16)), {})
Operator: aten.cat.default
cnt: 6, (([T([1, 512, 768], f16), T([1, 512, 768], f16, stride=(512, 1, 512)), T([1, 512, 768], f16)], 2), {})
Operator: aten.clone.default
cnt: 2, ((T([1, 512], i64),), {})
Operator: aten.copy_.default
cnt: 2, ((T([1, 512], i64), T([1, 512], i64)), {})
Operator: aten.div.Tensor
cnt: 12, ((T([1, 12, 512, 512], f16), T([], f16)), {})
Operator: aten.embedding.default
cnt: 1, ((T([50257, 768], f16), T([1, 512], i64)), {})
cnt: 1, ((T([1024, 768], f16), T([1, 512], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([1, 512, 768], f16), T([1, 512], i64), 1024, -1, False), {})
cnt: 1, ((T([1, 512, 768], f16), T([1, 512], i64), 50257, -1, False), {})
Operator: aten.mm.default
cnt: 1, ((T([512, 768], f16), T([768, 50257], f16, stride=(1, 768))), {})
cnt: 1, ((T([50257, 512], f16, stride=(1, 50257)), T([512, 768], f16)), {})
cnt: 1, ((T([512, 50257], f16), T([50257, 768], f16)), {})
cnt: 6, ((T([512, 768], f16), T([768, 3072], f16, stride=(1, 768))), {})
cnt: 6, ((T([3072, 512], f16, stride=(1, 3072)), T([512, 768], f16)), {})
cnt: 6, ((T([512, 3072], f16), T([3072, 768], f16, stride=(1, 3072))), {})
cnt: 6, ((T([768, 512], f16, stride=(1, 768)), T([512, 3072], f16)), {})
cnt: 6, ((T([512, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 6, ((T([768, 512], f16, stride=(1, 768)), T([512, 768], f16)), {})
cnt: 6, ((T([512, 2304], f16), T([2304, 768], f16, stride=(1, 2304))), {})
cnt: 6, ((T([768, 512], f16, stride=(1, 768)), T([512, 2304], f16)), {})
Operator: aten.mul.Scalar
cnt: 6, ((T([1, 512, 3072], f16), 3.0), {})
Operator: aten.mul.Tensor
cnt: 12, ((T([1, 512, 3072], f16), 0.5), {})
cnt: 12, ((T([1, 512, 3072], f16), 0.044715), {})
cnt: 12, ((T([1, 512, 3072], f16), 0.7978845608028654), {})
cnt: 24, ((T([1, 512, 3072], f16), T([1, 512, 3072], f16)), {})
Operator: aten.native_layer_norm.default
cnt: 13, ((T([1, 512, 768], f16), [768], T([768], f16), T([768], f16), 1e-05), {})
Operator: aten.native_layer_norm_backward.default
cnt: 13, ((T([1, 512, 768], f16), T([1, 512, 768], f16), [768], T([1, 512, 1], f32), T([1, 512, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([511, 50257], f16), T([511], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([511, 50257], f16), T([511], i64), None, 1, -100), {})
Operator: aten.pow.Tensor_Scalar
cnt: 6, ((T([1, 512, 3072], f16), 3.0), {})
cnt: 6, ((T([1, 512, 3072], f16), 2.0), {})
Operator: aten.slice_backward.default
cnt: 1, ((T([1, 511, 50257], f16), [1, 511, 50257], 2, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([1, 511, 50257], f16), [1, 512, 50257], 1, 0, -1, 1), {})
Operator: aten.split.Tensor
cnt: 6, ((T([1, 512, 2304], f16), 768, 2), {})
Operator: aten.sum.SymInt
cnt: 12, ((T([512, 768], f16), [0], True), {})
cnt: 6, ((T([512, 3072], f16), [0], True), {})
cnt: 6, ((T([512, 2304], f16), [0], True), {})
Operator: aten.tanh.default
cnt: 6, ((T([1, 512, 3072], f16),), {})
Operator: aten.tanh_backward.default
cnt: 6, ((T([1, 512, 3072], f16), T([1, 512, 3072], f16)), {})
Operator: aten.where.self
cnt: 12, ((T([1, 1, 512, 512], b8), T([1, 12, 512, 512], f16), T([], f16)), {})

View File

@ -0,0 +1,92 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([511, 30522], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([511, 30522], f16), T([511, 30522], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 12, ((T([1, 4, 512, 512], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 12, ((T([1, 4, 512, 512], f16), T([1, 4, 512, 512], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([1, 1, 1, 512], f32),), {'dtype': f16})
Operator: aten._unsafe_view.default
cnt: 12, ((T([4, 512, 512], f16), [1, 4, 512, 512]), {})
cnt: 12, ((T([4, 512, 64], f16), [1, 4, 512, 64]), {})
cnt: 24, ((T([1, 512, 4, 64], f16), [1, 512, 256]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([1, 512, 128], f16), T([1, 512, 128], f16)), {})
cnt: 12, ((T([1, 4, 512, 512], f16), T([1, 1, 1, 512], f16)), {})
cnt: 72, ((T([1, 512, 256], f16), T([1, 512, 256], f16)), {})
cnt: 1, ((T([30522, 128], f16), T([30522, 128], f16)), {})
Operator: aten.add_.Tensor
cnt: 1, ((T([1, 512, 128], f16), T([1, 512, 128], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([256], f16), T([512, 128], f16), T([128, 256], f16, stride=(1, 128))), {})
cnt: 48, ((T([256], f16), T([512, 256], f16), T([256, 256], f16, stride=(1, 256))), {})
cnt: 12, ((T([1024], f16), T([512, 256], f16), T([256, 1024], f16, stride=(1, 256))), {})
cnt: 12, ((T([256], f16), T([512, 1024], f16), T([1024, 256], f16, stride=(1, 1024))), {})
cnt: 1, ((T([128], f16), T([512, 256], f16), T([256, 128], f16, stride=(1, 256))), {})
cnt: 1, ((T([30522], f16), T([512, 128], f16), T([128, 30522], f16, stride=(1, 128))), {})
Operator: aten.bmm.default
cnt: 24, ((T([4, 512, 64], f16, stride=(64, 256, 1)), T([4, 64, 512], f16, stride=(64, 1, 256))), {})
cnt: 24, ((T([4, 512, 512], f16), T([4, 512, 64], f16, stride=(64, 256, 1))), {})
cnt: 12, ((T([4, 512, 512], f16, stride=(262144, 1, 512)), T([4, 512, 64], f16, stride=(64, 256, 1))), {})
cnt: 12, ((T([4, 64, 512], f16, stride=(64, 1, 256)), T([4, 512, 512], f16)), {})
Operator: aten.clone.default
cnt: 2, ((T([1, 512], i64),), {})
Operator: aten.copy_.default
cnt: 2, ((T([1, 512], i64), T([1, 512], i64)), {})
Operator: aten.div.Tensor
cnt: 24, ((T([1, 4, 512, 512], f16), 8.0), {})
Operator: aten.embedding.default
cnt: 1, ((T([30522, 128], f16), T([1, 512], i64), 0), {})
cnt: 1, ((T([2, 128], f16), T([1, 512], i64)), {})
cnt: 1, ((T([512, 128], f16), T([1, 512], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([1, 512, 128], f16), T([1, 512], i64), 512, -1, False), {})
cnt: 1, ((T([1, 512, 128], f16), T([1, 512], i64), 2, -1, False), {})
cnt: 1, ((T([1, 512, 128], f16), T([1, 512], i64), 30522, 0, False), {})
Operator: aten.gelu.default
cnt: 12, ((T([1, 512, 1024], f16),), {})
cnt: 1, ((T([1, 512, 128], f16),), {})
Operator: aten.gelu_backward.default
cnt: 1, ((T([1, 512, 128], f16), T([1, 512, 128], f16)), {})
cnt: 12, ((T([1, 512, 1024], f16), T([1, 512, 1024], f16)), {})
Operator: aten.mm.default
cnt: 1, ((T([512, 30522], f16), T([30522, 128], f16)), {})
cnt: 1, ((T([30522, 512], f16, stride=(1, 30522)), T([512, 128], f16)), {})
cnt: 1, ((T([512, 128], f16), T([128, 256], f16)), {})
cnt: 1, ((T([128, 512], f16, stride=(1, 128)), T([512, 256], f16)), {})
cnt: 12, ((T([512, 256], f16), T([256, 1024], f16)), {})
cnt: 12, ((T([256, 512], f16, stride=(1, 256)), T([512, 1024], f16)), {})
cnt: 12, ((T([512, 1024], f16), T([1024, 256], f16)), {})
cnt: 12, ((T([1024, 512], f16, stride=(1, 1024)), T([512, 256], f16)), {})
cnt: 36, ((T([512, 256], f16), T([256, 256], f16)), {})
cnt: 36, ((T([256, 512], f16, stride=(1, 256)), T([512, 256], f16)), {})
cnt: 12, ((T([512, 256], f16, stride=(1, 512)), T([256, 256], f16)), {})
cnt: 12, ((T([256, 512], f16), T([512, 256], f16)), {})
cnt: 1, ((T([512, 256], f16), T([256, 128], f16)), {})
cnt: 1, ((T([256, 512], f16, stride=(1, 256)), T([512, 128], f16)), {})
Operator: aten.mul.Tensor
cnt: 1, ((T([1, 1, 1, 512], f16), -65504.0), {})
Operator: aten.native_layer_norm.default
cnt: 2, ((T([1, 512, 128], f16), [128], T([128], f16), T([128], f16), 1e-12), {})
cnt: 24, ((T([1, 512, 256], f16), [256], T([256], f16), T([256], f16), 1e-12), {})
Operator: aten.native_layer_norm_backward.default
cnt: 2, ((T([1, 512, 128], f16), T([1, 512, 128], f16), [128], T([1, 512, 1], f32), T([1, 512, 1], f32), T([128], f16), T([128], f16), [True, True, True]), {})
cnt: 24, ((T([1, 512, 256], f16), T([1, 512, 256], f16), [256], T([1, 512, 1], f32), T([1, 512, 1], f32), T([256], f16), T([256], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([511, 30522], f16), T([511], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([511, 30522], f16), T([511], i64), None, 1, -100), {})
Operator: aten.rsub.Scalar
cnt: 1, ((T([1, 1, 1, 512], f16), 1.0), {})
Operator: aten.slice_backward.default
cnt: 1, ((T([1, 511, 30522], f16), [1, 511, 30522], 2, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([1, 511, 30522], f16), [1, 512, 30522], 1, 0, -1, 1), {})
cnt: 1, ((T([1, 512, 30522], f16), [1, 512, 30522], 0, 0, 9223372036854775807, 1), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([512, 30522], f16), [0], True), {})
cnt: 1, ((T([512, 128], f16), [0], True), {})
cnt: 49, ((T([512, 256], f16), [0], True), {})
cnt: 12, ((T([512, 1024], f16), [0], True), {})
cnt: 12, ((T([512, 256], f16, stride=(1, 512)), [0], True), {})

View File

@ -0,0 +1,94 @@
Operator: aten._log_softmax.default
cnt: 2, ((T([64, 512], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 2, ((T([64, 512], f16), T([64, 512], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 12, ((T([64, 4, 512, 512], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 12, ((T([64, 4, 512, 512], f16), T([64, 4, 512, 512], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([64, 1, 1, 512], f32),), {'dtype': f16})
Operator: aten._unsafe_view.default
cnt: 36, ((T([64, 4, 512, 64], f16), [256, 512, 64]), {})
cnt: 12, ((T([64, 4, 64, 512], f16), [256, 64, 512]), {})
cnt: 12, ((T([256, 512, 512], f16), [64, 4, 512, 512]), {})
cnt: 12, ((T([256, 512, 64], f16), [64, 4, 512, 64]), {})
cnt: 24, ((T([64, 512, 4, 64], f16), [64, 512, 256]), {})
cnt: 12, ((T([64, 512, 256], f16), [32768, 256]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([64, 512, 128], f16), T([64, 512, 128], f16)), {})
cnt: 12, ((T([64, 4, 512, 512], f16), T([64, 1, 1, 512], f16)), {})
cnt: 72, ((T([64, 512, 256], f16), T([64, 512, 256], f16)), {})
cnt: 1, ((T([], f16), T([], f16)), {})
Operator: aten.add_.Tensor
cnt: 1, ((T([64, 512, 128], f16), T([1, 512, 128], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([256], f16), T([32768, 128], f16), T([128, 256], f16, stride=(1, 128))), {})
cnt: 48, ((T([256], f16), T([32768, 256], f16), T([256, 256], f16, stride=(1, 256))), {})
cnt: 12, ((T([1024], f16), T([32768, 256], f16), T([256, 1024], f16, stride=(1, 256))), {})
cnt: 12, ((T([256], f16), T([32768, 1024], f16), T([1024, 256], f16, stride=(1, 1024))), {})
cnt: 1, ((T([2], f16), T([32768, 256], f16), T([256, 2], f16, stride=(1, 256))), {})
Operator: aten.bmm.default
cnt: 12, ((T([256, 512, 64], f16), T([256, 64, 512], f16)), {})
cnt: 12, ((T([256, 512, 512], f16), T([256, 512, 64], f16)), {})
cnt: 12, ((T([256, 512, 512], f16, stride=(262144, 1, 512)), T([256, 512, 64], f16)), {})
cnt: 12, ((T([256, 512, 64], f16), T([256, 64, 512], f16, stride=(32768, 1, 64))), {})
cnt: 12, ((T([256, 64, 512], f16, stride=(32768, 1, 64)), T([256, 512, 512], f16)), {})
cnt: 12, ((T([256, 512, 512], f16), T([256, 512, 64], f16, stride=(32768, 1, 512))), {})
Operator: aten.cat.default
cnt: 1, (([T([64, 512, 1], f16), T([64, 512, 1], f16)], 2), {})
Operator: aten.clamp.default
cnt: 2, ((T([64], i64), 0, 512), {})
Operator: aten.clone.default
cnt: 1, ((T([64, 512], i64),), {})
cnt: 2, ((T([64], i64),), {})
Operator: aten.copy_.default
cnt: 1, ((T([64, 512], i64), T([64, 512], i64)), {})
cnt: 2, ((T([64], i64), T([64], i64)), {})
Operator: aten.div.Tensor
cnt: 24, ((T([64, 4, 512, 512], f16), 8.0), {})
cnt: 2, ((T([], f16), 2), {})
Operator: aten.embedding.default
cnt: 1, ((T([30522, 128], f16), T([64, 512], i64), 0), {})
cnt: 1, ((T([2, 128], f16), T([64, 512], i64, stride=(0, 1))), {})
cnt: 1, ((T([512, 128], f16), T([1, 512], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([1, 512, 128], f16), T([1, 512], i64), 512, -1, False), {})
cnt: 1, ((T([64, 512, 128], f16), T([64, 512], i64, stride=(0, 1)), 2, -1, False), {})
cnt: 1, ((T([64, 512, 128], f16), T([64, 512], i64), 30522, 0, False), {})
Operator: aten.gelu.default
cnt: 12, ((T([64, 512, 1024], f16),), {})
Operator: aten.gelu_backward.default
cnt: 12, ((T([64, 512, 1024], f16), T([64, 512, 1024], f16)), {})
Operator: aten.mm.default
cnt: 1, ((T([32768, 2], f16), T([2, 256], f16)), {})
cnt: 1, ((T([2, 32768], f16, stride=(1, 2)), T([32768, 256], f16)), {})
cnt: 12, ((T([32768, 256], f16), T([256, 1024], f16)), {})
cnt: 12, ((T([256, 32768], f16, stride=(1, 256)), T([32768, 1024], f16)), {})
cnt: 12, ((T([32768, 1024], f16), T([1024, 256], f16)), {})
cnt: 12, ((T([1024, 32768], f16, stride=(1, 1024)), T([32768, 256], f16)), {})
cnt: 48, ((T([32768, 256], f16), T([256, 256], f16)), {})
cnt: 48, ((T([256, 32768], f16, stride=(1, 256)), T([32768, 256], f16)), {})
cnt: 1, ((T([32768, 256], f16), T([256, 128], f16)), {})
cnt: 1, ((T([256, 32768], f16, stride=(1, 256)), T([32768, 128], f16)), {})
Operator: aten.mul.Tensor
cnt: 1, ((T([64, 1, 1, 512], f16), -65504.0), {})
Operator: aten.native_layer_norm.default
cnt: 1, ((T([64, 512, 128], f16), [128], T([128], f16), T([128], f16), 1e-12), {})
cnt: 24, ((T([64, 512, 256], f16), [256], T([256], f16), T([256], f16), 1e-12), {})
Operator: aten.native_layer_norm_backward.default
cnt: 24, ((T([64, 512, 256], f16), T([64, 512, 256], f16), [256], T([64, 512, 1], f32), T([64, 512, 1], f32), T([256], f16), T([256], f16), [True, True, True]), {})
cnt: 1, ((T([64, 512, 128], f16), T([64, 512, 128], f16), [128], T([64, 512, 1], f32), T([64, 512, 1], f32), T([128], f16), T([128], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 2, ((T([], f16), T([64, 512], f16), T([64], i64), None, 1, 512, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 2, ((T([64, 512], f16), T([64], i64), None, 1, 512), {})
Operator: aten.rsub.Scalar
cnt: 1, ((T([64, 1, 1, 512], f16), 1.0), {})
Operator: aten.split.Tensor
cnt: 1, ((T([64, 512, 2], f16), 1, -1), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([32768, 2], f16), [0], True), {})
cnt: 61, ((T([32768, 256], f16), [0], True), {})
cnt: 12, ((T([32768, 1024], f16), [0], True), {})
cnt: 1, ((T([64, 512, 128], f16), [0], True), {})

View File

@ -0,0 +1,106 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([4, 2], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([4, 2], f16), T([4, 2], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 12, ((T([4, 12, 1024, 1024], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 12, ((T([4, 12, 1024, 1024], f16), T([4, 12, 1024, 1024], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 12, ((T([1, 1, 1024, 1024], u8),), {'dtype': torch.bool})
cnt: 12, ((T([], f16),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten._unsafe_view.default
cnt: 36, ((T([4, 12, 1024, 64], f16), [48, 1024, 64]), {})
cnt: 12, ((T([4, 12, 64, 1024], f16), [48, 64, 1024]), {})
cnt: 12, ((T([48, 1024, 1024], f16), [4, 12, 1024, 1024]), {})
cnt: 12, ((T([48, 1024, 64], f16), [4, 12, 1024, 64]), {})
cnt: 1, ((T([4096, 2], f16), [4, 1024, 2]), {})
cnt: 24, ((T([4, 1024, 12, 64], f16), [4, 1024, 768]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([4, 1024, 768], f16), T([1, 1024, 768], f16)), {})
cnt: 48, ((T([4, 1024, 768], f16), T([4, 1024, 768], f16)), {})
cnt: 36, ((T([4, 1024, 3072], f16), T([4, 1024, 3072], f16)), {})
cnt: 12, ((T([4, 1024, 3072], f16), 1.0), {})
Operator: aten.addmm.default
cnt: 12, ((T([2304], f16), T([4096, 768], f16), T([768, 2304], f16)), {})
cnt: 12, ((T([768], f16), T([4096, 768], f16), T([768, 768], f16)), {})
cnt: 12, ((T([3072], f16), T([4096, 768], f16), T([768, 3072], f16)), {})
cnt: 12, ((T([768], f16), T([4096, 3072], f16), T([3072, 768], f16)), {})
Operator: aten.bmm.default
cnt: 12, ((T([48, 1024, 64], f16), T([48, 64, 1024], f16)), {})
cnt: 12, ((T([48, 1024, 1024], f16), T([48, 1024, 64], f16)), {})
cnt: 12, ((T([48, 1024, 1024], f16, stride=(1048576, 1, 1024)), T([48, 1024, 64], f16)), {})
cnt: 12, ((T([48, 1024, 64], f16), T([48, 64, 1024], f16, stride=(65536, 1, 64))), {})
cnt: 12, ((T([48, 64, 1024], f16, stride=(65536, 1, 64)), T([48, 1024, 1024], f16)), {})
cnt: 12, ((T([48, 1024, 1024], f16), T([48, 1024, 64], f16, stride=(65536, 1, 1024))), {})
Operator: aten.cat.default
cnt: 12, (([T([4, 1024, 768], f16), T([4, 1024, 768], f16, stride=(786432, 1, 1024)), T([4, 1024, 768], f16)], 2), {})
Operator: aten.clone.default
cnt: 1, ((T([4, 1024], i64),), {})
cnt: 1, ((T([4], i64),), {})
Operator: aten.copy_.default
cnt: 1, ((T([4, 1024], i64), T([4, 1024], i64)), {})
cnt: 1, ((T([4], i64), T([4], i64)), {})
Operator: aten.div.Tensor
cnt: 24, ((T([4, 12, 1024, 1024], f16), T([], f16)), {})
Operator: aten.embedding.default
cnt: 1, ((T([50257, 768], f16), T([4, 1024], i64)), {})
cnt: 1, ((T([1024, 768], f16), T([1, 1024], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([1, 1024, 768], f16), T([1, 1024], i64), 1024, -1, False), {})
cnt: 1, ((T([4, 1024, 768], f16), T([4, 1024], i64), 50257, -1, False), {})
Operator: aten.index.Tensor
cnt: 1, ((T([4, 1024, 2], f16), [T([4], i64), T([4], i64)]), {})
Operator: aten.index_put.default
cnt: 1, ((T([4, 1024, 2], f16), [T([4], i64), T([4], i64)], T([4, 2], f16), True), {})
Operator: aten.mm.default
cnt: 1, ((T([4096, 768], f16), T([768, 2], f16, stride=(1, 768))), {})
cnt: 1, ((T([2, 4096], f16, stride=(1, 2)), T([4096, 768], f16)), {})
cnt: 1, ((T([4096, 2], f16), T([2, 768], f16)), {})
cnt: 12, ((T([4096, 768], f16), T([768, 3072], f16, stride=(1, 768))), {})
cnt: 12, ((T([3072, 4096], f16, stride=(1, 3072)), T([4096, 768], f16)), {})
cnt: 12, ((T([4096, 3072], f16), T([3072, 768], f16, stride=(1, 3072))), {})
cnt: 12, ((T([768, 4096], f16, stride=(1, 768)), T([4096, 3072], f16)), {})
cnt: 12, ((T([4096, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 12, ((T([768, 4096], f16, stride=(1, 768)), T([4096, 768], f16)), {})
cnt: 12, ((T([4096, 2304], f16), T([2304, 768], f16, stride=(1, 2304))), {})
cnt: 12, ((T([768, 4096], f16, stride=(1, 768)), T([4096, 2304], f16)), {})
Operator: aten.mul.Scalar
cnt: 12, ((T([4, 1024, 3072], f16), 3.0), {})
Operator: aten.mul.Tensor
cnt: 24, ((T([4, 1024, 3072], f16), 0.5), {})
cnt: 24, ((T([4, 1024, 3072], f16), 0.044715), {})
cnt: 24, ((T([4, 1024, 3072], f16), 0.7978845608028654), {})
cnt: 48, ((T([4, 1024, 3072], f16), T([4, 1024, 3072], f16)), {})
Operator: aten.native_layer_norm.default
cnt: 25, ((T([4, 1024, 768], f16), [768], T([768], f16), T([768], f16), 1e-05), {})
Operator: aten.native_layer_norm_backward.default
cnt: 25, ((T([4, 1024, 768], f16), T([4, 1024, 768], f16), [768], T([4, 1024, 1], f32), T([4, 1024, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
Operator: aten.ne.Scalar
cnt: 1, ((T([4, 1024], i64), 0), {})
Operator: aten.new_zeros.default
cnt: 1, ((T([4, 2], f16), [4, 1024, 2]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([4, 2], f16), T([4], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([4, 2], f16), T([4], i64), None, 1, -100), {})
Operator: aten.pow.Tensor_Scalar
cnt: 12, ((T([4, 1024, 3072], f16), 3.0), {})
cnt: 12, ((T([4, 1024, 3072], f16), 2.0), {})
Operator: aten.split.Tensor
cnt: 12, ((T([4, 1024, 2304], f16), 768, 2), {})
Operator: aten.sub.Tensor
cnt: 1, ((T([4], i64), 1), {})
Operator: aten.sum.SymInt
cnt: 24, ((T([4096, 768], f16), [0], True), {})
cnt: 12, ((T([4096, 3072], f16), [0], True), {})
cnt: 12, ((T([4096, 2304], f16), [0], True), {})
cnt: 1, ((T([4, 1024, 768], f16), [0], True), {})
Operator: aten.sum.dim_IntList
cnt: 1, ((T([4, 1024], b8), [-1]), {})
Operator: aten.tanh.default
cnt: 12, ((T([4, 1024, 3072], f16),), {})
Operator: aten.tanh_backward.default
cnt: 12, ((T([4, 1024, 3072], f16), T([4, 1024, 3072], f16)), {})
Operator: aten.where.self
cnt: 24, ((T([1, 1, 1024, 1024], b8), T([4, 12, 1024, 1024], f16), T([], f16)), {})

View File

@ -0,0 +1,96 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([127, 50257], f32), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([127, 50257], f32), T([127, 50257], f32), 1, f32), {})
Operator: aten._softmax.default
cnt: 24, ((T([1, 16, 128, 128], f32), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 24, ((T([1, 16, 128, 128], f32), T([1, 16, 128, 128], f32), -1, f32), {})
Operator: aten._to_copy.default
cnt: 48, ((T([1, 16, 128, 128], f16, stride=(262144, 128, 2048, 1)),), {'dtype': f32})
cnt: 24, ((T([1, 1, 128, 128], u8, stride=(4194304, 4194304, 2048, 1)),), {'dtype': torch.bool})
cnt: 24, ((T([], f32),), {'dtype': f32, 'layout': torch.strided, 'device': 'cuda'})
cnt: 24, ((T([1, 16, 128, 128], f32),), {'dtype': f16})
cnt: 1, ((T([1, 128, 50257], f16),), {'dtype': f32})
cnt: 1, ((T([1, 128, 50257], f32),), {'dtype': f16})
cnt: 1, ((T([], f32),), {'dtype': f16})
cnt: 1, ((T([], f16),), {'dtype': f32, 'layout': torch.strided, 'device': 'cuda'})
cnt: 1, ((T([1, 128, 50257], f32),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
cnt: 24, ((T([1, 16, 128, 128], f16),), {'dtype': f32, 'layout': torch.strided, 'device': 'cuda'})
cnt: 24, ((T([1, 16, 128, 128], f32, stride=(262144, 16384, 1, 128)),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
cnt: 24, ((T([1, 16, 128, 128], f32),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten._unsafe_view.default
cnt: 72, ((T([128, 2048], f16), [1, 128, 2048]), {})
cnt: 24, ((T([16, 128, 128], f32), [1, 16, 128, 128]), {})
cnt: 24, ((T([16, 128, 128], f16), [1, 16, 128, 128]), {})
cnt: 1, ((T([128, 50257], f16), [1, 128, 50257]), {})
cnt: 48, ((T([1, 128, 16, 128], f16), [1, 128, 2048]), {})
Operator: aten.add.Tensor
cnt: 145, ((T([1, 128, 2048], f16), T([1, 128, 2048], f16)), {})
cnt: 72, ((T([1, 128, 8192], f16), T([1, 128, 8192], f16)), {})
cnt: 24, ((T([1, 128, 8192], f16), 1.0), {})
cnt: 1, ((T([50257, 2048], f16), T([50257, 2048], f16)), {})
Operator: aten.addmm.default
cnt: 24, ((T([2048], f16), T([128, 2048], f16), T([2048, 2048], f16, stride=(1, 2048))), {})
cnt: 24, ((T([8192], f16), T([128, 2048], f16), T([2048, 8192], f16, stride=(1, 2048))), {})
cnt: 24, ((T([2048], f16), T([128, 8192], f16), T([8192, 2048], f16, stride=(1, 8192))), {})
Operator: aten.bmm.default
cnt: 24, ((T([16, 128, 128], f32, stride=(128, 2048, 1)), T([16, 128, 128], f32, stride=(128, 1, 2048))), {})
cnt: 24, ((T([16, 128, 128], f16), T([16, 128, 128], f16, stride=(128, 2048, 1))), {})
cnt: 24, ((T([16, 128, 128], f16, stride=(16384, 1, 128)), T([16, 128, 128], f16, stride=(128, 2048, 1))), {})
cnt: 24, ((T([16, 128, 128], f16, stride=(128, 2048, 1)), T([16, 128, 128], f16, stride=(128, 1, 2048))), {})
cnt: 24, ((T([16, 128, 128], f32, stride=(128, 1, 2048)), T([16, 128, 128], f32)), {})
cnt: 24, ((T([16, 128, 128], f32), T([16, 128, 128], f32, stride=(128, 2048, 1))), {})
Operator: aten.clone.default
cnt: 2, ((T([1, 128], i64),), {})
Operator: aten.copy_.default
cnt: 2, ((T([1, 128], i64), T([1, 128], i64)), {})
Operator: aten.embedding.default
cnt: 1, ((T([50257, 2048], f16), T([1, 128], i64)), {})
cnt: 1, ((T([2048, 2048], f16), T([1, 128], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([1, 128, 2048], f16), T([1, 128], i64), 2048, -1, False), {})
cnt: 1, ((T([1, 128, 2048], f16), T([1, 128], i64), 50257, -1, False), {})
Operator: aten.mm.default
cnt: 72, ((T([128, 2048], f16), T([2048, 2048], f16, stride=(1, 2048))), {})
cnt: 1, ((T([128, 2048], f16), T([2048, 50257], f16, stride=(1, 2048))), {})
cnt: 1, ((T([50257, 128], f16, stride=(1, 50257)), T([128, 2048], f16)), {})
cnt: 1, ((T([128, 50257], f16), T([50257, 2048], f16)), {})
cnt: 24, ((T([128, 2048], f16), T([2048, 8192], f16)), {})
cnt: 24, ((T([2048, 128], f16, stride=(1, 2048)), T([128, 8192], f16)), {})
cnt: 24, ((T([128, 8192], f16), T([8192, 2048], f16)), {})
cnt: 24, ((T([8192, 128], f16, stride=(1, 8192)), T([128, 2048], f16)), {})
cnt: 72, ((T([128, 2048], f16), T([2048, 2048], f16)), {})
cnt: 72, ((T([2048, 128], f16, stride=(1, 2048)), T([128, 2048], f16)), {})
cnt: 24, ((T([2048, 128], f16), T([128, 2048], f16)), {})
cnt: 24, ((T([128, 2048], f16, stride=(1, 128)), T([2048, 2048], f16)), {})
Operator: aten.mul.Scalar
cnt: 24, ((T([1, 128, 8192], f16), 3.0), {})
Operator: aten.mul.Tensor
cnt: 48, ((T([1, 128, 8192], f16), 0.5), {})
cnt: 48, ((T([1, 128, 8192], f16), 0.044715), {})
cnt: 48, ((T([1, 128, 8192], f16), 0.7978845608028654), {})
cnt: 96, ((T([1, 128, 8192], f16), T([1, 128, 8192], f16)), {})
Operator: aten.native_layer_norm.default
cnt: 49, ((T([1, 128, 2048], f16), [2048], T([2048], f16), T([2048], f16), 1e-05), {})
Operator: aten.native_layer_norm_backward.default
cnt: 49, ((T([1, 128, 2048], f16), T([1, 128, 2048], f16), [2048], T([1, 128, 1], f32), T([1, 128, 1], f32), T([2048], f16), T([2048], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f32), T([127, 50257], f32), T([127], i64), None, 1, -100, T([], f32)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([127, 50257], f32), T([127], i64), None, 1, -100), {})
Operator: aten.pow.Tensor_Scalar
cnt: 24, ((T([1, 128, 8192], f16), 3.0), {})
cnt: 24, ((T([1, 128, 8192], f16), 2.0), {})
Operator: aten.slice_backward.default
cnt: 1, ((T([1, 127, 50257], f32), [1, 127, 50257], 2, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([1, 127, 50257], f32), [1, 128, 50257], 1, 0, -1, 1), {})
Operator: aten.sum.SymInt
cnt: 48, ((T([128, 2048], f16), [0], True), {})
cnt: 24, ((T([128, 8192], f16), [0], True), {})
Operator: aten.tanh.default
cnt: 24, ((T([1, 128, 8192], f16),), {})
Operator: aten.tanh_backward.default
cnt: 24, ((T([1, 128, 8192], f16), T([1, 128, 8192], f16)), {})
Operator: aten.where.self
cnt: 48, ((T([1, 1, 128, 128], b8), T([1, 16, 128, 128], f32), T([], f32)), {})

View File

@ -0,0 +1,101 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([1, 2], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([1, 2], f16), T([1, 2], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 24, ((T([1, 16, 128, 128], f32), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 24, ((T([1, 16, 128, 128], f32), T([1, 16, 128, 128], f32), -1, f32), {})
Operator: aten._to_copy.default
cnt: 48, ((T([1, 16, 128, 128], f16, stride=(262144, 128, 2048, 1)),), {'dtype': f32})
cnt: 24, ((T([1, 1, 128, 128], u8, stride=(4194304, 4194304, 2048, 1)),), {'dtype': torch.bool})
cnt: 24, ((T([], f32),), {'dtype': f32, 'layout': torch.strided, 'device': 'cuda'})
cnt: 24, ((T([1, 16, 128, 128], f32),), {'dtype': f16})
cnt: 24, ((T([1, 16, 128, 128], f16),), {'dtype': f32, 'layout': torch.strided, 'device': 'cuda'})
cnt: 24, ((T([1, 16, 128, 128], f32, stride=(262144, 16384, 1, 128)),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
cnt: 24, ((T([1, 16, 128, 128], f32),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten._unsafe_view.default
cnt: 72, ((T([128, 2048], f16), [1, 128, 2048]), {})
cnt: 24, ((T([16, 128, 128], f32), [1, 16, 128, 128]), {})
cnt: 24, ((T([16, 128, 128], f16), [1, 16, 128, 128]), {})
cnt: 1, ((T([128, 2], f16), [1, 128, 2]), {})
cnt: 48, ((T([1, 128, 16, 128], f16), [1, 128, 2048]), {})
Operator: aten.add.Tensor
cnt: 145, ((T([1, 128, 2048], f16), T([1, 128, 2048], f16)), {})
cnt: 72, ((T([1, 128, 8192], f16), T([1, 128, 8192], f16)), {})
cnt: 24, ((T([1, 128, 8192], f16), 1.0), {})
Operator: aten.addmm.default
cnt: 24, ((T([2048], f16), T([128, 2048], f16), T([2048, 2048], f16, stride=(1, 2048))), {})
cnt: 24, ((T([8192], f16), T([128, 2048], f16), T([2048, 8192], f16, stride=(1, 2048))), {})
cnt: 24, ((T([2048], f16), T([128, 8192], f16), T([8192, 2048], f16, stride=(1, 8192))), {})
Operator: aten.bmm.default
cnt: 24, ((T([16, 128, 128], f32, stride=(128, 2048, 1)), T([16, 128, 128], f32, stride=(128, 1, 2048))), {})
cnt: 24, ((T([16, 128, 128], f16), T([16, 128, 128], f16, stride=(128, 2048, 1))), {})
cnt: 24, ((T([16, 128, 128], f16, stride=(16384, 1, 128)), T([16, 128, 128], f16, stride=(128, 2048, 1))), {})
cnt: 24, ((T([16, 128, 128], f16, stride=(128, 2048, 1)), T([16, 128, 128], f16, stride=(128, 1, 2048))), {})
cnt: 24, ((T([16, 128, 128], f32, stride=(128, 1, 2048)), T([16, 128, 128], f32)), {})
cnt: 24, ((T([16, 128, 128], f32), T([16, 128, 128], f32, stride=(128, 2048, 1))), {})
Operator: aten.clone.default
cnt: 1, ((T([1, 128], i64),), {})
cnt: 1, ((T([1], i64),), {})
Operator: aten.copy_.default
cnt: 1, ((T([1, 128], i64), T([1, 128], i64)), {})
cnt: 1, ((T([1], i64), T([1], i64)), {})
Operator: aten.embedding.default
cnt: 1, ((T([50257, 2048], f16), T([1, 128], i64)), {})
cnt: 1, ((T([2048, 2048], f16), T([1, 128], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([1, 128, 2048], f16), T([1, 128], i64), 2048, -1, False), {})
cnt: 1, ((T([1, 128, 2048], f16), T([1, 128], i64), 50257, -1, False), {})
Operator: aten.index.Tensor
cnt: 1, ((T([1, 128, 2], f16), [T([1], i64), T([1], i64)]), {})
Operator: aten.index_put.default
cnt: 1, ((T([1, 128, 2], f16), [T([1], i64), T([1], i64)], T([1, 2], f16), True), {})
Operator: aten.mm.default
cnt: 72, ((T([128, 2048], f16), T([2048, 2048], f16, stride=(1, 2048))), {})
cnt: 1, ((T([128, 2048], f16), T([2048, 2], f16, stride=(1, 2048))), {})
cnt: 1, ((T([2, 128], f16, stride=(1, 2)), T([128, 2048], f16)), {})
cnt: 1, ((T([128, 2], f16), T([2, 2048], f16)), {})
cnt: 24, ((T([128, 2048], f16), T([2048, 8192], f16)), {})
cnt: 24, ((T([2048, 128], f16, stride=(1, 2048)), T([128, 8192], f16)), {})
cnt: 24, ((T([128, 8192], f16), T([8192, 2048], f16)), {})
cnt: 24, ((T([8192, 128], f16, stride=(1, 8192)), T([128, 2048], f16)), {})
cnt: 72, ((T([128, 2048], f16), T([2048, 2048], f16)), {})
cnt: 72, ((T([2048, 128], f16, stride=(1, 2048)), T([128, 2048], f16)), {})
cnt: 24, ((T([2048, 128], f16), T([128, 2048], f16)), {})
cnt: 24, ((T([128, 2048], f16, stride=(1, 128)), T([2048, 2048], f16)), {})
Operator: aten.mul.Scalar
cnt: 24, ((T([1, 128, 8192], f16), 3.0), {})
Operator: aten.mul.Tensor
cnt: 48, ((T([1, 128, 8192], f16), 0.5), {})
cnt: 48, ((T([1, 128, 8192], f16), 0.044715), {})
cnt: 48, ((T([1, 128, 8192], f16), 0.7978845608028654), {})
cnt: 96, ((T([1, 128, 8192], f16), T([1, 128, 8192], f16)), {})
Operator: aten.native_layer_norm.default
cnt: 49, ((T([1, 128, 2048], f16), [2048], T([2048], f16), T([2048], f16), 1e-05), {})
Operator: aten.native_layer_norm_backward.default
cnt: 49, ((T([1, 128, 2048], f16), T([1, 128, 2048], f16), [2048], T([1, 128, 1], f32), T([1, 128, 1], f32), T([2048], f16), T([2048], f16), [True, True, True]), {})
Operator: aten.ne.Scalar
cnt: 1, ((T([1, 128], i64), 0), {})
Operator: aten.new_zeros.default
cnt: 1, ((T([1, 2], f16), [1, 128, 2]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([1, 2], f16), T([1], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([1, 2], f16), T([1], i64), None, 1, -100), {})
Operator: aten.pow.Tensor_Scalar
cnt: 24, ((T([1, 128, 8192], f16), 3.0), {})
cnt: 24, ((T([1, 128, 8192], f16), 2.0), {})
Operator: aten.sub.Tensor
cnt: 1, ((T([1], i64), 1), {})
Operator: aten.sum.SymInt
cnt: 48, ((T([128, 2048], f16), [0], True), {})
cnt: 24, ((T([128, 8192], f16), [0], True), {})
Operator: aten.sum.dim_IntList
cnt: 1, ((T([1, 128], b8), [-1]), {})
Operator: aten.tanh.default
cnt: 24, ((T([1, 128, 8192], f16),), {})
Operator: aten.tanh_backward.default
cnt: 24, ((T([1, 128, 8192], f16), T([1, 128, 8192], f16)), {})
Operator: aten.where.self
cnt: 48, ((T([1, 1, 128, 128], b8), T([1, 16, 128, 128], f32), T([], f32)), {})

View File

@ -0,0 +1,83 @@
Operator: aten._fft_c2c.default
cnt: 12, ((T([1, 512, 768], c32), [1, 2], 0, True), {})
cnt: 12, ((T([1, 512, 768], c32), [1, 2], 0, False), {})
Operator: aten._log_softmax.default
cnt: 1, ((T([512, 32000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([512, 32000], f16), T([512, 32000], f16), 1, f16), {})
Operator: aten._to_copy.default
cnt: 12, ((T([1, 512, 768], f16),), {'dtype': c32})
Operator: aten.add.Tensor
cnt: 28, ((T([1, 512, 768], f16), T([1, 512, 768], f16)), {})
cnt: 24, ((T([1, 512, 768], f16), T([1, 512, 768], f16, stride=(786432, 1536, 2))), {})
cnt: 36, ((T([1, 512, 3072], f16), T([1, 512, 3072], f16)), {})
cnt: 12, ((T([1, 512, 3072], f16), 1.0), {})
cnt: 1, ((T([1, 512, 768], f16), 1.0), {})
cnt: 1, ((T([32000, 768], f16), T([32000, 768], f16)), {})
Operator: aten.add_.Tensor
cnt: 1, ((T([1, 512, 768], f16), T([1, 512, 768], f16)), {})
Operator: aten.addmm.default
cnt: 2, ((T([768], f16), T([512, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 12, ((T([3072], f16), T([512, 768], f16), T([768, 3072], f16, stride=(1, 768))), {})
cnt: 12, ((T([768], f16), T([512, 3072], f16), T([3072, 768], f16, stride=(1, 3072))), {})
cnt: 1, ((T([768], f16), T([1, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 1, ((T([32000], f16), T([512, 768], f16), T([768, 32000], f16, stride=(1, 768))), {})
Operator: aten.clone.default
cnt: 2, ((T([1, 512], i64),), {})
Operator: aten.copy_.default
cnt: 2, ((T([1, 512], i64), T([1, 512], i64)), {})
Operator: aten.embedding.default
cnt: 1, ((T([32000, 768], f16), T([1, 512], i64), 3), {})
cnt: 1, ((T([4, 768], f16), T([1, 512], i64)), {})
cnt: 1, ((T([512, 768], f16), T([1, 512], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([1, 512, 768], f16), T([1, 512], i64), 512, -1, False), {})
cnt: 1, ((T([1, 512, 768], f16), T([1, 512], i64), 4, -1, False), {})
cnt: 1, ((T([1, 512, 768], f16), T([1, 512], i64), 32000, 3, False), {})
Operator: aten.mm.default
cnt: 1, ((T([512, 32000], f16), T([32000, 768], f16)), {})
cnt: 1, ((T([32000, 512], f16, stride=(1, 32000)), T([512, 768], f16)), {})
cnt: 2, ((T([512, 768], f16), T([768, 768], f16)), {})
cnt: 2, ((T([768, 512], f16, stride=(1, 768)), T([512, 768], f16)), {})
cnt: 12, ((T([512, 768], f16), T([768, 3072], f16)), {})
cnt: 12, ((T([768, 512], f16, stride=(1, 768)), T([512, 3072], f16)), {})
cnt: 12, ((T([512, 3072], f16), T([3072, 768], f16)), {})
cnt: 12, ((T([3072, 512], f16, stride=(1, 3072)), T([512, 768], f16)), {})
Operator: aten.mul.Scalar
cnt: 1, ((T([1, 512, 768], f16), 3.0), {})
cnt: 12, ((T([1, 512, 3072], f16), 3.0), {})
Operator: aten.mul.Tensor
cnt: 24, ((T([1, 512, 3072], f16), 0.5), {})
cnt: 24, ((T([1, 512, 3072], f16), 0.044715), {})
cnt: 24, ((T([1, 512, 3072], f16), 0.7978845608028654), {})
cnt: 48, ((T([1, 512, 3072], f16), T([1, 512, 3072], f16)), {})
cnt: 2, ((T([1, 512, 768], f16), 0.5), {})
cnt: 2, ((T([1, 512, 768], f16), 0.044715), {})
cnt: 2, ((T([1, 512, 768], f16), 0.7978845608028654), {})
cnt: 4, ((T([1, 512, 768], f16), T([1, 512, 768], f16)), {})
Operator: aten.native_layer_norm.default
cnt: 26, ((T([1, 512, 768], f16), [768], T([768], f16), T([768], f16), 1e-12), {})
Operator: aten.native_layer_norm_backward.default
cnt: 26, ((T([1, 512, 768], f16), T([1, 512, 768], f16), [768], T([1, 512, 1], f32), T([1, 512, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([512, 32000], f16), T([512], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([512, 32000], f16), T([512], i64), None, 1, -100), {})
Operator: aten.pow.Tensor_Scalar
cnt: 12, ((T([1, 512, 3072], f16), 3.0), {})
cnt: 1, ((T([1, 512, 768], f16), 3.0), {})
cnt: 1, ((T([1, 512, 768], f16), 2.0), {})
cnt: 12, ((T([1, 512, 3072], f16), 2.0), {})
Operator: aten.select_backward.default
cnt: 12, ((T([1, 512, 768], f16), [1, 512, 768, 2], 3, 0), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([512, 32000], f16), [0], True), {})
cnt: 14, ((T([512, 768], f16), [0], True), {})
cnt: 12, ((T([512, 3072], f16), [0], True), {})
Operator: aten.tanh.default
cnt: 12, ((T([1, 512, 3072], f16),), {})
cnt: 1, ((T([1, 768], f16),), {})
cnt: 1, ((T([1, 512, 768], f16),), {})
Operator: aten.tanh_backward.default
cnt: 1, ((T([1, 512, 768], f16), T([1, 512, 768], f16)), {})
cnt: 12, ((T([1, 512, 3072], f16), T([1, 512, 3072], f16)), {})

View File

@ -0,0 +1,90 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([8192, 30522], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([8192, 30522], f16), T([8192, 30522], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 12, ((T([16, 12, 512, 512], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 12, ((T([16, 12, 512, 512], f16), T([16, 12, 512, 512], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([16, 1, 1, 512], f32),), {'dtype': f16})
Operator: aten._unsafe_view.default
cnt: 36, ((T([16, 12, 512, 64], f16), [192, 512, 64]), {})
cnt: 12, ((T([16, 12, 64, 512], f16), [192, 64, 512]), {})
cnt: 12, ((T([192, 512, 512], f16), [16, 12, 512, 512]), {})
cnt: 12, ((T([192, 512, 64], f16), [16, 12, 512, 64]), {})
cnt: 24, ((T([16, 512, 12, 64], f16), [16, 512, 768]), {})
cnt: 12, ((T([16, 512, 768], f16), [8192, 768]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([16, 512, 768], f16), T([1, 512, 768], f16)), {})
cnt: 79, ((T([16, 512, 768], f16), T([16, 512, 768], f16)), {})
cnt: 12, ((T([16, 12, 512, 512], f16), T([16, 1, 1, 512], f16)), {})
cnt: 2, ((T([1024, 768], f16), T([1024, 768], f16)), {})
cnt: 1, ((T([30522, 768], f16), T([30522, 768], f16)), {})
Operator: aten.addmm.default
cnt: 49, ((T([768], f16), T([8192, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 12, ((T([3072], f16), T([8192, 768], f16), T([768, 3072], f16, stride=(1, 768))), {})
cnt: 12, ((T([768], f16), T([8192, 3072], f16), T([3072, 768], f16, stride=(1, 3072))), {})
cnt: 1, ((T([768], f16), T([16, 768], f16, stride=(393216, 1)), T([768, 768], f16, stride=(1, 768))), {})
cnt: 1, ((T([30522], f16), T([8192, 768], f16), T([768, 30522], f16, stride=(1, 768))), {})
Operator: aten.bmm.default
cnt: 12, ((T([192, 512, 64], f16), T([192, 64, 512], f16)), {})
cnt: 12, ((T([192, 512, 512], f16), T([192, 512, 64], f16)), {})
cnt: 12, ((T([192, 512, 512], f16, stride=(262144, 1, 512)), T([192, 512, 64], f16)), {})
cnt: 12, ((T([192, 512, 64], f16), T([192, 64, 512], f16, stride=(32768, 1, 64))), {})
cnt: 12, ((T([192, 64, 512], f16, stride=(32768, 1, 64)), T([192, 512, 512], f16)), {})
cnt: 12, ((T([192, 512, 512], f16), T([192, 512, 64], f16, stride=(32768, 1, 512))), {})
Operator: aten.clone.default
cnt: 2, ((T([16, 512], i64),), {})
Operator: aten.copy_.default
cnt: 2, ((T([16, 512], i64), T([16, 512], i64)), {})
Operator: aten.div.Tensor
cnt: 24, ((T([16, 12, 512, 512], f16), 8.0), {})
Operator: aten.embedding.default
cnt: 1, ((T([30522, 768], f16), T([16, 512], i64), 0), {})
cnt: 1, ((T([512, 768], f16), T([1, 512], i64)), {})
cnt: 4, ((T([1024, 768], f16), T([16, 512], i64, stride=(2048, 4))), {})
cnt: 2, ((T([1024, 768], f16), T([16, 512], i64)), {})
cnt: 1, ((T([2, 768], f16), T([16, 512], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([16, 512, 768], f16), T([16, 512], i64), 2, -1, False), {})
cnt: 2, ((T([16, 512, 768], f16), T([16, 512], i64), 1024, -1, False), {})
cnt: 4, ((T([16, 512, 768], f16), T([16, 512], i64, stride=(2048, 4)), 1024, -1, False), {})
cnt: 1, ((T([1, 512, 768], f16), T([1, 512], i64), 512, -1, False), {})
cnt: 1, ((T([16, 512, 768], f16), T([16, 512], i64), 30522, 0, False), {})
Operator: aten.gelu.default
cnt: 12, ((T([16, 512, 3072], f16),), {})
cnt: 1, ((T([16, 512, 768], f16),), {})
Operator: aten.gelu_backward.default
cnt: 1, ((T([16, 512, 768], f16), T([16, 512, 768], f16)), {})
cnt: 12, ((T([16, 512, 3072], f16), T([16, 512, 3072], f16)), {})
Operator: aten.mm.default
cnt: 1, ((T([8192, 30522], f16), T([30522, 768], f16)), {})
cnt: 1, ((T([30522, 8192], f16, stride=(1, 30522)), T([8192, 768], f16)), {})
cnt: 49, ((T([8192, 768], f16), T([768, 768], f16)), {})
cnt: 49, ((T([768, 8192], f16, stride=(1, 768)), T([8192, 768], f16)), {})
cnt: 12, ((T([8192, 768], f16), T([768, 3072], f16)), {})
cnt: 12, ((T([768, 8192], f16, stride=(1, 768)), T([8192, 3072], f16)), {})
cnt: 12, ((T([8192, 3072], f16), T([3072, 768], f16)), {})
cnt: 12, ((T([3072, 8192], f16, stride=(1, 3072)), T([8192, 768], f16)), {})
Operator: aten.mul.Tensor
cnt: 1, ((T([16, 1, 1, 512], f16), -65504.0), {})
Operator: aten.native_layer_norm.default
cnt: 26, ((T([16, 512, 768], f16), [768], T([768], f16), T([768], f16), 1e-12), {})
Operator: aten.native_layer_norm_backward.default
cnt: 26, ((T([16, 512, 768], f16), T([16, 512, 768], f16), [768], T([16, 512, 1], f32), T([16, 512, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([8192, 30522], f16), T([8192], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([8192, 30522], f16), T([8192], i64), None, 1, -100), {})
Operator: aten.rsub.Scalar
cnt: 1, ((T([16, 1, 1, 512], f16), 1.0), {})
Operator: aten.sub.Tensor
cnt: 2, ((T([16, 512], i64, stride=(2048, 4)), T([16, 512], i64, stride=(2048, 4))), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([8192, 30522], f16), [0], True), {})
cnt: 61, ((T([8192, 768], f16), [0], True), {})
cnt: 12, ((T([8192, 3072], f16), [0], True), {})
cnt: 1, ((T([16, 512, 768], f16), [0], True), {})
Operator: aten.tanh.default
cnt: 1, ((T([16, 768], f16),), {})

View File

@ -0,0 +1,98 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([16, 2], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([16, 2], f16), T([16, 2], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 12, ((T([16, 12, 512, 512], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 12, ((T([16, 12, 512, 512], f16), T([16, 12, 512, 512], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([16, 1, 1, 512], f32),), {'dtype': f16})
Operator: aten._unsafe_view.default
cnt: 36, ((T([16, 12, 512, 64], f16), [192, 512, 64]), {})
cnt: 12, ((T([16, 12, 64, 512], f16), [192, 64, 512]), {})
cnt: 12, ((T([192, 512, 512], f16), [16, 12, 512, 512]), {})
cnt: 12, ((T([192, 512, 64], f16), [16, 12, 512, 64]), {})
cnt: 24, ((T([16, 512, 12, 64], f16), [16, 512, 768]), {})
cnt: 12, ((T([16, 512, 768], f16), [8192, 768]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([16, 512, 768], f16), T([1, 512, 768], f16)), {})
cnt: 79, ((T([16, 512, 768], f16), T([16, 512, 768], f16)), {})
cnt: 12, ((T([16, 12, 512, 512], f16), T([16, 1, 1, 512], f16)), {})
cnt: 2, ((T([1024, 768], f16), T([1024, 768], f16)), {})
Operator: aten.addmm.default
cnt: 48, ((T([768], f16), T([8192, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 12, ((T([3072], f16), T([8192, 768], f16), T([768, 3072], f16, stride=(1, 768))), {})
cnt: 12, ((T([768], f16), T([8192, 3072], f16), T([3072, 768], f16, stride=(1, 3072))), {})
cnt: 1, ((T([768], f16), T([16, 768], f16, stride=(393216, 1)), T([768, 768], f16, stride=(1, 768))), {})
cnt: 1, ((T([2], f16), T([16, 768], f16), T([768, 2], f16, stride=(1, 768))), {})
Operator: aten.bmm.default
cnt: 12, ((T([192, 512, 64], f16), T([192, 64, 512], f16)), {})
cnt: 12, ((T([192, 512, 512], f16), T([192, 512, 64], f16)), {})
cnt: 12, ((T([192, 512, 512], f16, stride=(262144, 1, 512)), T([192, 512, 64], f16)), {})
cnt: 12, ((T([192, 512, 64], f16), T([192, 64, 512], f16, stride=(32768, 1, 64))), {})
cnt: 12, ((T([192, 64, 512], f16, stride=(32768, 1, 64)), T([192, 512, 512], f16)), {})
cnt: 12, ((T([192, 512, 512], f16), T([192, 512, 64], f16, stride=(32768, 1, 512))), {})
Operator: aten.clone.default
cnt: 1, ((T([16, 512], i64),), {})
cnt: 1, ((T([16], i64),), {})
Operator: aten.copy_.default
cnt: 1, ((T([16, 512], i64), T([16, 512], i64)), {})
cnt: 1, ((T([16], i64), T([16], i64)), {})
Operator: aten.div.Tensor
cnt: 24, ((T([16, 12, 512, 512], f16), 8.0), {})
Operator: aten.embedding.default
cnt: 1, ((T([30522, 768], f16), T([16, 512], i64), 0), {})
cnt: 1, ((T([512, 768], f16), T([1, 512], i64)), {})
cnt: 4, ((T([1024, 768], f16), T([16, 512], i64, stride=(2048, 4))), {})
cnt: 2, ((T([1024, 768], f16), T([16, 512], i64)), {})
cnt: 1, ((T([2, 768], f16), T([16, 512], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([16, 512, 768], f16), T([16, 512], i64), 2, -1, False), {})
cnt: 2, ((T([16, 512, 768], f16), T([16, 512], i64), 1024, -1, False), {})
cnt: 4, ((T([16, 512, 768], f16), T([16, 512], i64, stride=(2048, 4)), 1024, -1, False), {})
cnt: 1, ((T([1, 512, 768], f16), T([1, 512], i64), 512, -1, False), {})
cnt: 1, ((T([16, 512, 768], f16), T([16, 512], i64), 30522, 0, False), {})
Operator: aten.gelu.default
cnt: 12, ((T([16, 512, 3072], f16),), {})
Operator: aten.gelu_backward.default
cnt: 12, ((T([16, 512, 3072], f16), T([16, 512, 3072], f16)), {})
Operator: aten.mm.default
cnt: 1, ((T([16, 2], f16), T([2, 768], f16)), {})
cnt: 1, ((T([2, 16], f16, stride=(1, 2)), T([16, 768], f16)), {})
cnt: 1, ((T([16, 768], f16), T([768, 768], f16)), {})
cnt: 1, ((T([768, 16], f16, stride=(1, 768)), T([16, 768], f16, stride=(393216, 1))), {})
cnt: 12, ((T([8192, 768], f16), T([768, 3072], f16)), {})
cnt: 12, ((T([768, 8192], f16, stride=(1, 768)), T([8192, 3072], f16)), {})
cnt: 12, ((T([8192, 3072], f16), T([3072, 768], f16)), {})
cnt: 12, ((T([3072, 8192], f16, stride=(1, 3072)), T([8192, 768], f16)), {})
cnt: 48, ((T([8192, 768], f16), T([768, 768], f16)), {})
cnt: 48, ((T([768, 8192], f16, stride=(1, 768)), T([8192, 768], f16)), {})
Operator: aten.mul.Tensor
cnt: 1, ((T([16, 1, 1, 512], f16), -65504.0), {})
Operator: aten.native_layer_norm.default
cnt: 25, ((T([16, 512, 768], f16), [768], T([768], f16), T([768], f16), 1e-12), {})
Operator: aten.native_layer_norm_backward.default
cnt: 25, ((T([16, 512, 768], f16), T([16, 512, 768], f16), [768], T([16, 512, 1], f32), T([16, 512, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([16, 2], f16), T([16], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([16, 2], f16), T([16], i64), None, 1, -100), {})
Operator: aten.rsub.Scalar
cnt: 1, ((T([16, 1, 1, 512], f16), 1.0), {})
Operator: aten.select_backward.default
cnt: 1, ((T([16, 768], f16), [16, 512, 768], 1, 0), {})
Operator: aten.slice_backward.default
cnt: 1, ((T([16, 512, 768], f16), [16, 512, 768], 0, 0, 9223372036854775807, 1), {})
Operator: aten.sub.Tensor
cnt: 2, ((T([16, 512], i64, stride=(2048, 4)), T([16, 512], i64, stride=(2048, 4))), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([16, 2], f16), [0], True), {})
cnt: 1, ((T([16, 768], f16), [0], True), {})
cnt: 60, ((T([8192, 768], f16), [0], True), {})
cnt: 12, ((T([8192, 3072], f16), [0], True), {})
cnt: 1, ((T([16, 512, 768], f16), [0], True), {})
Operator: aten.tanh.default
cnt: 1, ((T([16, 768], f16),), {})
Operator: aten.tanh_backward.default
cnt: 1, ((T([16, 768], f16), T([16, 768], f16)), {})

View File

@ -0,0 +1,88 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([256, 128112], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([256, 128112], f16), T([256, 128112], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 36, ((T([32, 128, 128], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 36, ((T([32, 128, 128], f16), T([32, 128, 128], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 2, ((T([2, 128], b8),), {'dtype': i32})
cnt: 2, ((T([2, 128], i64),), {'dtype': i32, 'layout': torch.strided, 'device': 'cuda'})
cnt: 2, ((T([2, 128], i32),), {'dtype': i64})
cnt: 1, ((T([128, 128], f32),), {'dtype': f16})
cnt: 1, ((T([2, 1, 128, 128], f16, stride=(0, 16384, 128, 1)),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten._unsafe_view.default
cnt: 108, ((T([2, 128, 16, 64], f16), [2, 128, 1024]), {})
cnt: 1, ((T([256, 128112], f16), [2, 128, 128112]), {})
cnt: 36, ((T([2, 16, 128, 64], f16), [32, 128, 64]), {})
cnt: 36, ((T([2, 128, 1024], f16), [256, 1024]), {})
Operator: aten.add.Tensor
cnt: 2, ((T([2, 128], i32), 0), {})
cnt: 2, ((T([2, 128], i64), 1), {})
cnt: 193, ((T([2, 128, 1024], f16), T([2, 128, 1024], f16)), {})
cnt: 1, ((T([128], i64), 1), {})
cnt: 12, ((T([2, 16, 128, 128], f16), T([2, 1, 128, 128], f16)), {})
cnt: 2, ((T([128112, 1024], f16), T([128112, 1024], f16)), {})
Operator: aten.addmm.default
cnt: 144, ((T([1024], f16), T([256, 1024], f16), T([1024, 1024], f16, stride=(1, 1024))), {})
cnt: 24, ((T([4096], f16), T([256, 1024], f16), T([1024, 4096], f16, stride=(1, 1024))), {})
cnt: 24, ((T([1024], f16), T([256, 4096], f16), T([4096, 1024], f16, stride=(1, 4096))), {})
Operator: aten.any.default
cnt: 24, ((T([2, 128, 1024], b8),), {})
Operator: aten.bmm.default
cnt: 72, ((T([32, 128, 64], f16), T([32, 64, 128], f16, stride=(8192, 1, 64))), {})
cnt: 72, ((T([32, 128, 128], f16), T([32, 128, 64], f16)), {})
cnt: 36, ((T([32, 128, 128], f16, stride=(16384, 1, 128)), T([32, 128, 64], f16)), {})
cnt: 36, ((T([32, 64, 128], f16, stride=(8192, 1, 64)), T([32, 128, 128], f16)), {})
Operator: aten.clone.default
cnt: 3, ((T([2, 128], i64),), {})
Operator: aten.copy_.default
cnt: 3, ((T([2, 128], i64), T([2, 128], i64)), {})
Operator: aten.cumsum.default
cnt: 2, ((T([2, 128], i32), 1), {})
Operator: aten.embedding.default
cnt: 2, ((T([128112, 1024], f16), T([2, 128], i64), 1), {})
Operator: aten.embedding_dense_backward.default
cnt: 2, ((T([2, 128, 1024], f16), T([2, 128], i64), 128112, 1, False), {})
Operator: aten.index_select.default
cnt: 2, ((T([1026, 1024], f16), 0, T([256], i64)), {})
Operator: aten.isinf.default
cnt: 12, ((T([2, 128, 1024], f16),), {})
Operator: aten.isnan.default
cnt: 12, ((T([2, 128, 1024], f16),), {})
Operator: aten.lt.Tensor
cnt: 1, ((T([128], i64), T([128, 1], i64)), {})
Operator: aten.masked_fill_.Scalar
cnt: 1, ((T([128, 128], f32), T([128, 128], b8), 0), {})
Operator: aten.mm.default
cnt: 1, ((T([256, 1024], f16), T([1024, 128112], f16, stride=(1, 1024))), {})
cnt: 1, ((T([128112, 256], f16, stride=(1, 128112)), T([256, 1024], f16)), {})
cnt: 1, ((T([256, 128112], f16), T([128112, 1024], f16)), {})
cnt: 24, ((T([256, 1024], f16), T([1024, 4096], f16)), {})
cnt: 24, ((T([1024, 256], f16, stride=(1, 1024)), T([256, 4096], f16)), {})
cnt: 24, ((T([256, 4096], f16), T([4096, 1024], f16)), {})
cnt: 24, ((T([4096, 256], f16, stride=(1, 4096)), T([256, 1024], f16)), {})
cnt: 144, ((T([256, 1024], f16), T([1024, 1024], f16)), {})
cnt: 144, ((T([1024, 256], f16, stride=(1, 1024)), T([256, 1024], f16)), {})
Operator: aten.mul.Tensor
cnt: 4, ((T([2, 128, 1024], f16), 32.0), {})
cnt: 2, ((T([2, 128], i32), T([2, 128], i32)), {})
cnt: 72, ((T([2, 128, 1024], f16), 0.125), {})
Operator: aten.native_layer_norm.default
cnt: 62, ((T([2, 128, 1024], f16), [1024], T([1024], f16), T([1024], f16), 1e-05), {})
Operator: aten.native_layer_norm_backward.default
cnt: 62, ((T([2, 128, 1024], f16), T([2, 128, 1024], f16), [1024], T([2, 128, 1], f32), T([2, 128, 1], f32), T([1024], f16), T([1024], f16), [True, True, True]), {})
Operator: aten.ne.Scalar
cnt: 2, ((T([2, 128], i64), 1), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([256, 128112], f16), T([256], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([256, 128112], f16), T([256], i64), None, 1, -100), {})
Operator: aten.relu.default
cnt: 24, ((T([2, 128, 4096], f16),), {})
Operator: aten.sum.SymInt
cnt: 168, ((T([256, 1024], f16), [0], True), {})
cnt: 24, ((T([256, 4096], f16), [0], True), {})
Operator: aten.threshold_backward.default
cnt: 24, ((T([2, 128, 4096], f16), T([2, 128, 4096], f16), 0), {})

View File

@ -0,0 +1,73 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([2048, 50265], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([2048, 50265], f16), T([2048, 50265], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 12, ((T([256, 128, 128], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 12, ((T([256, 128, 128], f16), T([256, 128, 128], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([128, 128], f32),), {'dtype': f16})
cnt: 1, ((T([16, 1, 128, 128], f16, stride=(0, 16384, 128, 1)),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten._unsafe_view.default
cnt: 36, ((T([16, 128, 16, 64], f16), [16, 128, 1024]), {})
cnt: 1, ((T([2048, 50265], f16), [16, 128, 50265]), {})
cnt: 12, ((T([16, 16, 128, 64], f16), [256, 128, 64]), {})
cnt: 12, ((T([16, 128, 1024], f16), [2048, 1024]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([128], i64), 1), {})
cnt: 1, ((T([16, 128], i64, stride=(0, 1)), 2), {})
cnt: 73, ((T([16, 128, 1024], f16), T([16, 128, 1024], f16)), {})
cnt: 12, ((T([16, 16, 128, 128], f16), T([16, 1, 128, 128], f16)), {})
cnt: 1, ((T([50265, 1024], f16), T([50265, 1024], f16)), {})
Operator: aten.addmm.default
cnt: 48, ((T([1024], f16), T([2048, 1024], f16), T([1024, 1024], f16, stride=(1, 1024))), {})
cnt: 12, ((T([4096], f16), T([2048, 1024], f16), T([1024, 4096], f16, stride=(1, 1024))), {})
cnt: 12, ((T([1024], f16), T([2048, 4096], f16), T([4096, 1024], f16, stride=(1, 4096))), {})
Operator: aten.bmm.default
cnt: 24, ((T([256, 128, 64], f16), T([256, 64, 128], f16, stride=(8192, 1, 64))), {})
cnt: 24, ((T([256, 128, 128], f16), T([256, 128, 64], f16)), {})
cnt: 12, ((T([256, 128, 128], f16, stride=(16384, 1, 128)), T([256, 128, 64], f16)), {})
cnt: 12, ((T([256, 64, 128], f16, stride=(8192, 1, 64)), T([256, 128, 128], f16)), {})
Operator: aten.clone.default
cnt: 2, ((T([16, 128], i64),), {})
Operator: aten.copy_.default
cnt: 2, ((T([16, 128], i64), T([16, 128], i64)), {})
Operator: aten.embedding.default
cnt: 1, ((T([50265, 1024], f16), T([16, 128], i64), 1), {})
cnt: 1, ((T([1026, 1024], f16), T([16, 128], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([16, 128, 1024], f16), T([16, 128], i64), 1026, -1, False), {})
cnt: 1, ((T([16, 128, 1024], f16), T([16, 128], i64), 50265, 1, False), {})
Operator: aten.gelu.default
cnt: 12, ((T([16, 128, 4096], f16),), {})
Operator: aten.gelu_backward.default
cnt: 12, ((T([16, 128, 4096], f16), T([16, 128, 4096], f16)), {})
Operator: aten.lt.Tensor
cnt: 1, ((T([128], i64), T([128, 1], i64)), {})
Operator: aten.masked_fill_.Scalar
cnt: 1, ((T([128, 128], f32), T([128, 128], b8), 0), {})
Operator: aten.mm.default
cnt: 1, ((T([2048, 1024], f16), T([1024, 50265], f16, stride=(1, 1024))), {})
cnt: 1, ((T([50265, 2048], f16, stride=(1, 50265)), T([2048, 1024], f16)), {})
cnt: 1, ((T([2048, 50265], f16), T([50265, 1024], f16)), {})
cnt: 12, ((T([2048, 1024], f16), T([1024, 4096], f16)), {})
cnt: 12, ((T([1024, 2048], f16, stride=(1, 1024)), T([2048, 4096], f16)), {})
cnt: 12, ((T([2048, 4096], f16), T([4096, 1024], f16)), {})
cnt: 12, ((T([4096, 2048], f16, stride=(1, 4096)), T([2048, 1024], f16)), {})
cnt: 48, ((T([2048, 1024], f16), T([1024, 1024], f16)), {})
cnt: 48, ((T([1024, 2048], f16, stride=(1, 1024)), T([2048, 1024], f16)), {})
Operator: aten.mul.Tensor
cnt: 2, ((T([16, 128, 1024], f16), 1.0), {})
cnt: 24, ((T([16, 128, 1024], f16), 0.125), {})
Operator: aten.native_layer_norm.default
cnt: 26, ((T([16, 128, 1024], f16), [1024], T([1024], f16), T([1024], f16), 1e-05), {})
Operator: aten.native_layer_norm_backward.default
cnt: 26, ((T([16, 128, 1024], f16), T([16, 128, 1024], f16), [1024], T([16, 128, 1], f32), T([16, 128, 1], f32), T([1024], f16), T([1024], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([2048, 50265], f16), T([2048], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([2048, 50265], f16), T([2048], i64), None, 1, -100), {})
Operator: aten.sum.SymInt
cnt: 60, ((T([2048, 1024], f16), [0], True), {})
cnt: 12, ((T([2048, 4096], f16), [0], True), {})

View File

@ -0,0 +1,94 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([1024, 50265], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([1024, 50265], f16), T([1024, 50265], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 36, ((T([128, 128, 128], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 36, ((T([128, 128, 128], f16), T([128, 128, 128], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([128, 128], f32),), {'dtype': f16})
cnt: 1, ((T([8, 1, 128, 128], f16, stride=(0, 16384, 128, 1)),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten._unsafe_view.default
cnt: 108, ((T([8, 128, 16, 64], f16), [8, 128, 1024]), {})
cnt: 1, ((T([1024, 50265], f16), [8, 128, 50265]), {})
cnt: 36, ((T([8, 16, 128, 64], f16), [128, 128, 64]), {})
cnt: 36, ((T([8, 128, 1024], f16), [1024, 1024]), {})
Operator: aten.add.Tensor
cnt: 2, ((T([8, 128], i64, stride=(0, 1)), 2), {})
cnt: 193, ((T([8, 128, 1024], f16), T([8, 128, 1024], f16)), {})
cnt: 1, ((T([128], i64), 1), {})
cnt: 12, ((T([8, 16, 128, 128], f16), T([8, 1, 128, 128], f16)), {})
cnt: 1, ((T([8, 128, 50265], f16), T([1, 50265], f16)), {})
cnt: 2, ((T([50265, 1024], f16), T([50265, 1024], f16)), {})
Operator: aten.addmm.default
cnt: 144, ((T([1024], f16), T([1024, 1024], f16), T([1024, 1024], f16, stride=(1, 1024))), {})
cnt: 24, ((T([4096], f16), T([1024, 1024], f16), T([1024, 4096], f16, stride=(1, 1024))), {})
cnt: 24, ((T([1024], f16), T([1024, 4096], f16), T([4096, 1024], f16, stride=(1, 4096))), {})
Operator: aten.any.default
cnt: 24, ((T([8, 128, 1024], b8),), {})
Operator: aten.bmm.default
cnt: 72, ((T([128, 128, 64], f16), T([128, 64, 128], f16, stride=(8192, 1, 64))), {})
cnt: 72, ((T([128, 128, 128], f16), T([128, 128, 64], f16)), {})
cnt: 36, ((T([128, 128, 128], f16, stride=(16384, 1, 128)), T([128, 128, 64], f16)), {})
cnt: 36, ((T([128, 64, 128], f16, stride=(8192, 1, 64)), T([128, 128, 128], f16)), {})
Operator: aten.clone.default
cnt: 3, ((T([8, 128], i64),), {})
cnt: 1, ((T([8, 127], i64, stride=(128, 1)),), {})
Operator: aten.copy_.default
cnt: 2, ((T([8, 128], i64), T([8, 128], i64)), {})
cnt: 1, ((T([8, 127], i64, stride=(128, 1)), T([8, 127], i64)), {})
cnt: 1, ((T([8], i64, stride=(128,)), T([8], i64)), {})
Operator: aten.embedding.default
cnt: 2, ((T([50265, 1024], f16), T([8, 128], i64), 1), {})
cnt: 2, ((T([1026, 1024], f16), T([8, 128], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 2, ((T([8, 128, 1024], f16), T([8, 128], i64), 1026, -1, False), {})
cnt: 2, ((T([8, 128, 1024], f16), T([8, 128], i64), 50265, 1, False), {})
Operator: aten.eq.Scalar
cnt: 1, ((T([8, 128], i64), -100), {})
Operator: aten.gather.default
cnt: 1, ((T([8, 128], i64), 1, T([8, 1], i64)), {})
Operator: aten.gelu.default
cnt: 24, ((T([8, 128, 4096], f16),), {})
Operator: aten.gelu_backward.default
cnt: 24, ((T([8, 128, 4096], f16), T([8, 128, 4096], f16)), {})
Operator: aten.isinf.default
cnt: 12, ((T([8, 128, 1024], f16),), {})
Operator: aten.isnan.default
cnt: 12, ((T([8, 128, 1024], f16),), {})
Operator: aten.lt.Tensor
cnt: 1, ((T([128], i64), T([128, 1], i64)), {})
Operator: aten.masked_fill_.Scalar
cnt: 1, ((T([8, 128], i64), T([8, 128], b8), 1), {})
cnt: 1, ((T([128, 128], f32), T([128, 128], b8), 0), {})
Operator: aten.mm.default
cnt: 1, ((T([1024, 1024], f16), T([1024, 50265], f16, stride=(1, 1024))), {})
cnt: 1, ((T([50265, 1024], f16, stride=(1, 50265)), T([1024, 1024], f16)), {})
cnt: 1, ((T([1024, 50265], f16), T([50265, 1024], f16)), {})
cnt: 24, ((T([1024, 1024], f16), T([1024, 4096], f16)), {})
cnt: 24, ((T([1024, 1024], f16, stride=(1, 1024)), T([1024, 4096], f16)), {})
cnt: 24, ((T([1024, 4096], f16), T([4096, 1024], f16)), {})
cnt: 24, ((T([4096, 1024], f16, stride=(1, 4096)), T([1024, 1024], f16)), {})
cnt: 144, ((T([1024, 1024], f16), T([1024, 1024], f16)), {})
cnt: 144, ((T([1024, 1024], f16, stride=(1, 1024)), T([1024, 1024], f16)), {})
Operator: aten.mul.Tensor
cnt: 4, ((T([8, 128, 1024], f16), 1.0), {})
cnt: 72, ((T([8, 128, 1024], f16), 0.125), {})
Operator: aten.native_layer_norm.default
cnt: 64, ((T([8, 128, 1024], f16), [1024], T([1024], f16), T([1024], f16), 1e-05), {})
Operator: aten.native_layer_norm_backward.default
cnt: 64, ((T([8, 128, 1024], f16), T([8, 128, 1024], f16), [1024], T([8, 128, 1], f32), T([8, 128, 1], f32), T([1024], f16), T([1024], f16), [True, True, True]), {})
Operator: aten.ne.Scalar
cnt: 1, ((T([8, 128], i64), 1), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([1024, 50265], f16), T([1024], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([1024, 50265], f16), T([1024], i64), None, 1, -100), {})
Operator: aten.sub.Tensor
cnt: 1, ((T([8], i64), 1), {})
Operator: aten.sum.SymInt
cnt: 168, ((T([1024, 1024], f16), [0], True), {})
cnt: 24, ((T([1024, 4096], f16), [0], True), {})
Operator: aten.sum.dim_IntList
cnt: 1, ((T([8, 128], b8), [1]), {})

View File

@ -0,0 +1,85 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([254, 29056], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([254, 29056], f16), T([254, 29056], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 24, ((T([2, 16, 128, 128], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 24, ((T([2, 16, 128, 128], f16), T([2, 16, 128, 128], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([2, 1, 1, 128], f32),), {'dtype': f16})
Operator: aten._unsafe_view.default
cnt: 72, ((T([2, 16, 128, 64], f16), [32, 128, 64]), {})
cnt: 24, ((T([2, 16, 64, 128], f16), [32, 64, 128]), {})
cnt: 24, ((T([32, 128, 128], f16), [2, 16, 128, 128]), {})
cnt: 24, ((T([32, 128, 64], f16), [2, 16, 128, 64]), {})
cnt: 48, ((T([2, 128, 16, 64], f16), [2, 128, 1024]), {})
cnt: 24, ((T([2, 128, 1024], f16), [256, 1024]), {})
Operator: aten.add.Tensor
cnt: 145, ((T([2, 128, 1024], f16), T([2, 128, 1024], f16)), {})
cnt: 24, ((T([2, 16, 128, 128], f16), T([2, 1, 1, 128], f16)), {})
cnt: 1, ((T([29056, 1024], f16), T([29056, 1024], f16)), {})
Operator: aten.add_.Tensor
cnt: 1, ((T([2, 128, 1024], f16), T([1, 128, 1024], f16)), {})
Operator: aten.addmm.default
cnt: 97, ((T([1024], f16), T([256, 1024], f16), T([1024, 1024], f16, stride=(1, 1024))), {})
cnt: 24, ((T([4096], f16), T([256, 1024], f16), T([1024, 4096], f16, stride=(1, 1024))), {})
cnt: 24, ((T([1024], f16), T([256, 4096], f16), T([4096, 1024], f16, stride=(1, 4096))), {})
cnt: 1, ((T([29056], f16), T([256, 1024], f16), T([1024, 29056], f16, stride=(1, 1024))), {})
Operator: aten.bmm.default
cnt: 24, ((T([32, 128, 64], f16), T([32, 64, 128], f16)), {})
cnt: 24, ((T([32, 128, 128], f16), T([32, 128, 64], f16)), {})
cnt: 24, ((T([32, 128, 128], f16, stride=(16384, 1, 128)), T([32, 128, 64], f16)), {})
cnt: 24, ((T([32, 128, 64], f16), T([32, 64, 128], f16, stride=(8192, 1, 64))), {})
cnt: 24, ((T([32, 64, 128], f16, stride=(8192, 1, 64)), T([32, 128, 128], f16)), {})
cnt: 24, ((T([32, 128, 128], f16), T([32, 128, 64], f16, stride=(8192, 1, 128))), {})
Operator: aten.clone.default
cnt: 2, ((T([2, 128], i64),), {})
Operator: aten.copy_.default
cnt: 2, ((T([2, 128], i64), T([2, 128], i64)), {})
Operator: aten.div.Tensor
cnt: 48, ((T([2, 16, 128, 128], f16), 8.0), {})
Operator: aten.embedding.default
cnt: 1, ((T([29056, 1024], f16), T([2, 128], i64), 0), {})
cnt: 1, ((T([2, 1024], f16), T([2, 128], i64)), {})
cnt: 1, ((T([512, 1024], f16), T([1, 128], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([1, 128, 1024], f16), T([1, 128], i64), 512, -1, False), {})
cnt: 1, ((T([2, 128, 1024], f16), T([2, 128], i64), 2, -1, False), {})
cnt: 1, ((T([2, 128, 1024], f16), T([2, 128], i64), 29056, 0, False), {})
Operator: aten.gelu.default
cnt: 24, ((T([2, 128, 4096], f16),), {})
cnt: 1, ((T([2, 128, 1024], f16),), {})
Operator: aten.gelu_backward.default
cnt: 1, ((T([2, 128, 1024], f16), T([2, 128, 1024], f16)), {})
cnt: 24, ((T([2, 128, 4096], f16), T([2, 128, 4096], f16)), {})
Operator: aten.mm.default
cnt: 1, ((T([256, 29056], f16), T([29056, 1024], f16)), {})
cnt: 1, ((T([29056, 256], f16, stride=(1, 29056)), T([256, 1024], f16)), {})
cnt: 97, ((T([256, 1024], f16), T([1024, 1024], f16)), {})
cnt: 97, ((T([1024, 256], f16, stride=(1, 1024)), T([256, 1024], f16)), {})
cnt: 24, ((T([256, 1024], f16), T([1024, 4096], f16)), {})
cnt: 24, ((T([1024, 256], f16, stride=(1, 1024)), T([256, 4096], f16)), {})
cnt: 24, ((T([256, 4096], f16), T([4096, 1024], f16)), {})
cnt: 24, ((T([4096, 256], f16, stride=(1, 4096)), T([256, 1024], f16)), {})
Operator: aten.mul.Tensor
cnt: 1, ((T([2, 1, 1, 128], f16), -65504.0), {})
Operator: aten.native_layer_norm.default
cnt: 50, ((T([2, 128, 1024], f16), [1024], T([1024], f16), T([1024], f16), 1e-12), {})
Operator: aten.native_layer_norm_backward.default
cnt: 50, ((T([2, 128, 1024], f16), T([2, 128, 1024], f16), [1024], T([2, 128, 1], f32), T([2, 128, 1], f32), T([1024], f16), T([1024], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([254, 29056], f16), T([254], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([254, 29056], f16), T([254], i64), None, 1, -100), {})
Operator: aten.rsub.Scalar
cnt: 1, ((T([2, 1, 1, 128], f16), 1.0), {})
Operator: aten.slice_backward.default
cnt: 1, ((T([2, 127, 29056], f16), [2, 127, 29056], 2, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([2, 127, 29056], f16), [2, 128, 29056], 1, 0, -1, 1), {})
cnt: 1, ((T([2, 128, 29056], f16), [2, 128, 29056], 0, 0, 9223372036854775807, 1), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([256, 29056], f16), [0], True), {})
cnt: 121, ((T([256, 1024], f16), [0], True), {})
cnt: 24, ((T([256, 4096], f16), [0], True), {})
cnt: 1, ((T([2, 128, 1024], f16), [0], True), {})

View File

@ -0,0 +1,88 @@
Operator: aten._log_softmax.default
cnt: 2, ((T([8, 128], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 2, ((T([8, 128], f16), T([8, 128], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 24, ((T([8, 16, 128, 128], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 24, ((T([8, 16, 128, 128], f16), T([8, 16, 128, 128], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([8, 1, 1, 128], f32),), {'dtype': f16})
Operator: aten._unsafe_view.default
cnt: 72, ((T([8, 16, 128, 64], f16), [128, 128, 64]), {})
cnt: 24, ((T([8, 16, 64, 128], f16), [128, 64, 128]), {})
cnt: 24, ((T([128, 128, 128], f16), [8, 16, 128, 128]), {})
cnt: 24, ((T([128, 128, 64], f16), [8, 16, 128, 64]), {})
cnt: 48, ((T([8, 128, 16, 64], f16), [8, 128, 1024]), {})
cnt: 24, ((T([8, 128, 1024], f16), [1024, 1024]), {})
Operator: aten.add.Tensor
cnt: 145, ((T([8, 128, 1024], f16), T([8, 128, 1024], f16)), {})
cnt: 24, ((T([8, 16, 128, 128], f16), T([8, 1, 1, 128], f16)), {})
cnt: 1, ((T([], f16), T([], f16)), {})
Operator: aten.add_.Tensor
cnt: 1, ((T([8, 128, 1024], f16), T([1, 128, 1024], f16)), {})
Operator: aten.addmm.default
cnt: 96, ((T([1024], f16), T([1024, 1024], f16), T([1024, 1024], f16, stride=(1, 1024))), {})
cnt: 24, ((T([4096], f16), T([1024, 1024], f16), T([1024, 4096], f16, stride=(1, 1024))), {})
cnt: 24, ((T([1024], f16), T([1024, 4096], f16), T([4096, 1024], f16, stride=(1, 4096))), {})
cnt: 1, ((T([2], f16), T([1024, 1024], f16), T([1024, 2], f16, stride=(1, 1024))), {})
Operator: aten.bmm.default
cnt: 24, ((T([128, 128, 64], f16), T([128, 64, 128], f16)), {})
cnt: 24, ((T([128, 128, 128], f16), T([128, 128, 64], f16)), {})
cnt: 24, ((T([128, 128, 128], f16, stride=(16384, 1, 128)), T([128, 128, 64], f16)), {})
cnt: 24, ((T([128, 128, 64], f16), T([128, 64, 128], f16, stride=(8192, 1, 64))), {})
cnt: 24, ((T([128, 64, 128], f16, stride=(8192, 1, 64)), T([128, 128, 128], f16)), {})
cnt: 24, ((T([128, 128, 128], f16), T([128, 128, 64], f16, stride=(8192, 1, 128))), {})
Operator: aten.cat.default
cnt: 1, (([T([8, 128, 1], f16), T([8, 128, 1], f16)], 2), {})
Operator: aten.clamp.default
cnt: 2, ((T([8], i64), 0, 128), {})
Operator: aten.clone.default
cnt: 1, ((T([8, 128], i64),), {})
cnt: 2, ((T([8], i64),), {})
Operator: aten.copy_.default
cnt: 1, ((T([8, 128], i64), T([8, 128], i64)), {})
cnt: 2, ((T([8], i64), T([8], i64)), {})
Operator: aten.div.Tensor
cnt: 48, ((T([8, 16, 128, 128], f16), 8.0), {})
cnt: 2, ((T([], f16), 2), {})
Operator: aten.embedding.default
cnt: 1, ((T([29056, 1024], f16), T([8, 128], i64), 0), {})
cnt: 1, ((T([2, 1024], f16), T([8, 128], i64)), {})
cnt: 1, ((T([512, 1024], f16), T([1, 128], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([1, 128, 1024], f16), T([1, 128], i64), 512, -1, False), {})
cnt: 1, ((T([8, 128, 1024], f16), T([8, 128], i64), 2, -1, False), {})
cnt: 1, ((T([8, 128, 1024], f16), T([8, 128], i64), 29056, 0, False), {})
Operator: aten.gelu.default
cnt: 24, ((T([8, 128, 4096], f16),), {})
Operator: aten.gelu_backward.default
cnt: 24, ((T([8, 128, 4096], f16), T([8, 128, 4096], f16)), {})
Operator: aten.mm.default
cnt: 1, ((T([1024, 2], f16), T([2, 1024], f16)), {})
cnt: 1, ((T([2, 1024], f16, stride=(1, 2)), T([1024, 1024], f16)), {})
cnt: 24, ((T([1024, 1024], f16), T([1024, 4096], f16)), {})
cnt: 24, ((T([1024, 1024], f16, stride=(1, 1024)), T([1024, 4096], f16)), {})
cnt: 24, ((T([1024, 4096], f16), T([4096, 1024], f16)), {})
cnt: 24, ((T([4096, 1024], f16, stride=(1, 4096)), T([1024, 1024], f16)), {})
cnt: 96, ((T([1024, 1024], f16), T([1024, 1024], f16)), {})
cnt: 96, ((T([1024, 1024], f16, stride=(1, 1024)), T([1024, 1024], f16)), {})
Operator: aten.mul.Tensor
cnt: 1, ((T([8, 1, 1, 128], f16), -65504.0), {})
Operator: aten.native_layer_norm.default
cnt: 49, ((T([8, 128, 1024], f16), [1024], T([1024], f16), T([1024], f16), 1e-12), {})
Operator: aten.native_layer_norm_backward.default
cnt: 49, ((T([8, 128, 1024], f16), T([8, 128, 1024], f16), [1024], T([8, 128, 1], f32), T([8, 128, 1], f32), T([1024], f16), T([1024], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 2, ((T([], f16), T([8, 128], f16), T([8], i64), None, 1, 128, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 2, ((T([8, 128], f16), T([8], i64), None, 1, 128), {})
Operator: aten.rsub.Scalar
cnt: 1, ((T([8, 1, 1, 128], f16), 1.0), {})
Operator: aten.split.Tensor
cnt: 1, ((T([8, 128, 2], f16), 1, -1), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([1024, 2], f16), [0], True), {})
cnt: 120, ((T([1024, 1024], f16), [0], True), {})
cnt: 24, ((T([1024, 4096], f16), [0], True), {})
cnt: 1, ((T([8, 128, 1024], f16), [0], True), {})

View File

@ -0,0 +1,112 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([2048, 30522], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([2048, 30522], f16), T([2048, 30522], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 24, ((T([16, 4, 128, 128], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 24, ((T([16, 4, 128, 128], f16), T([16, 4, 128, 128], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([16, 1, 1, 128], f32),), {'dtype': f16})
Operator: aten._unsafe_view.default
cnt: 72, ((T([16, 4, 128, 32], f16), [64, 128, 32]), {})
cnt: 24, ((T([16, 4, 32, 128], f16), [64, 32, 128]), {})
cnt: 24, ((T([64, 128, 128], f16), [16, 4, 128, 128]), {})
cnt: 24, ((T([64, 128, 32], f16), [16, 4, 128, 32]), {})
cnt: 1, ((T([2048, 30522], f16), [16, 128, 30522]), {})
cnt: 48, ((T([16, 128, 4, 32], f16), [16, 128, 128]), {})
cnt: 24, ((T([16, 128, 128], f16), [2048, 128]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([16, 128, 512], f16), T([1, 128, 512], f16)), {})
cnt: 97, ((T([16, 128, 512], f16), T([16, 128, 512], f16)), {})
cnt: 25, ((T([16, 128, 512], f16), T([512], f16)), {})
cnt: 168, ((T([16, 128, 128], f16), T([128], f16)), {})
cnt: 24, ((T([16, 4, 128, 128], f16), T([16, 1, 1, 128], f16)), {})
cnt: 241, ((T([16, 128, 128], f16), T([16, 128, 128], f16)), {})
cnt: 1, ((T([16, 128, 128], f16, stride=(49152, 384, 1)), T([16, 128, 128], f16)), {})
cnt: 1, ((T([30522, 128], f16, stride=(1, 30522)), T([30522, 128], f16)), {})
Operator: aten.add_.Tensor
cnt: 1, ((T([16, 128, 30522], f16), T([30522], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([512], f16), T([2048, 384], f16), T([384, 512], f16, stride=(1, 384))), {})
cnt: 168, ((T([128], f16), T([2048, 512], f16), T([512, 128], f16, stride=(1, 512))), {})
cnt: 72, ((T([128], f16), T([2048, 128], f16), T([128, 128], f16, stride=(1, 128))), {})
cnt: 120, ((T([512], f16), T([2048, 128], f16), T([128, 512], f16, stride=(1, 128))), {})
cnt: 1, ((T([512], f16), T([2048, 512], f16), T([512, 512], f16, stride=(1, 512))), {})
Operator: aten.bmm.default
cnt: 24, ((T([64, 128, 32], f16), T([64, 32, 128], f16)), {})
cnt: 24, ((T([64, 128, 128], f16), T([64, 128, 32], f16)), {})
cnt: 24, ((T([64, 128, 128], f16, stride=(16384, 1, 128)), T([64, 128, 32], f16)), {})
cnt: 24, ((T([64, 128, 32], f16), T([64, 32, 128], f16, stride=(4096, 1, 32))), {})
cnt: 24, ((T([64, 32, 128], f16, stride=(4096, 1, 32)), T([64, 128, 128], f16)), {})
cnt: 24, ((T([64, 128, 128], f16), T([64, 128, 32], f16, stride=(4096, 1, 128))), {})
Operator: aten.cat.default
cnt: 1, (([T([16, 128, 128], f16), T([16, 128, 128], f16), T([16, 128, 128], f16)], 2), {})
cnt: 1, (([T([128, 30522], f16, stride=(1, 128)), T([384, 30522], f16)],), {})
Operator: aten.clone.default
cnt: 2, ((T([16, 128], i64),), {})
Operator: aten.constant_pad_nd.default
cnt: 1, ((T([16, 127, 128], f16, stride=(16384, 128, 1)), [0, 0, 0, 1, 0, 0], 0.0), {})
cnt: 1, ((T([16, 127, 128], f16, stride=(16384, 128, 1)), [0, 0, 1, 0, 0, 0], 0.0), {})
cnt: 1, ((T([16, 128, 128], f16, stride=(49152, 384, 1)), [0, 0, -1, 0, 0, 0]), {})
cnt: 1, ((T([16, 128, 128], f16, stride=(49152, 384, 1)), [0, 0, 0, -1, 0, 0]), {})
Operator: aten.copy_.default
cnt: 2, ((T([16, 128], i64), T([16, 128], i64)), {})
cnt: 1, ((T([30522, 128], f16), T([30522, 128], f16, stride=(1, 30522))), {})
Operator: aten.div.Tensor
cnt: 48, ((T([16, 4, 128, 128], f16), 5.656854249492381), {})
Operator: aten.embedding.default
cnt: 1, ((T([30522, 128], f16), T([16, 128], i64), 0), {})
cnt: 1, ((T([512, 512], f16), T([1, 128], i64)), {})
cnt: 1, ((T([2, 512], f16), T([16, 128], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([16, 128, 512], f16), T([16, 128], i64), 2, -1, False), {})
cnt: 1, ((T([1, 128, 512], f16), T([1, 128], i64), 512, -1, False), {})
cnt: 1, ((T([16, 128, 128], f16), T([16, 128], i64), 30522, 0, False), {})
Operator: aten.mm.default
cnt: 1, ((T([2048, 512], f16), T([512, 30522], f16)), {})
cnt: 1, ((T([512, 2048], f16, stride=(1, 512)), T([2048, 30522], f16)), {})
cnt: 1, ((T([2048, 30522], f16), T([30522, 512], f16, stride=(1, 30522))), {})
cnt: 1, ((T([2048, 512], f16), T([512, 512], f16)), {})
cnt: 1, ((T([512, 2048], f16, stride=(1, 512)), T([2048, 512], f16)), {})
cnt: 120, ((T([2048, 512], f16), T([512, 128], f16)), {})
cnt: 120, ((T([512, 2048], f16, stride=(1, 512)), T([2048, 128], f16)), {})
cnt: 168, ((T([2048, 128], f16), T([128, 512], f16)), {})
cnt: 168, ((T([128, 2048], f16, stride=(1, 128)), T([2048, 512], f16)), {})
cnt: 72, ((T([2048, 128], f16), T([128, 128], f16)), {})
cnt: 72, ((T([128, 2048], f16, stride=(1, 128)), T([2048, 128], f16)), {})
cnt: 1, ((T([2048, 512], f16), T([512, 384], f16)), {})
cnt: 1, ((T([512, 2048], f16, stride=(1, 512)), T([2048, 384], f16)), {})
Operator: aten.mul.Tensor
cnt: 1, ((T([16, 1, 1, 128], f16), -65504.0), {})
cnt: 50, ((T([16, 128, 512], f16), T([512], f16)), {})
cnt: 336, ((T([16, 128, 128], f16), T([128], f16)), {})
cnt: 25, ((T([16, 128, 512], f16), T([16, 128, 512], f16)), {})
cnt: 168, ((T([16, 128, 128], f16), T([16, 128, 128], f16)), {})
Operator: aten.native_layer_norm.default
cnt: 1, ((T([16, 128, 512], f16), [512], T([512], f16), T([512], f16), 1e-12), {})
Operator: aten.native_layer_norm_backward.default
cnt: 1, ((T([16, 128, 512], f16), T([16, 128, 512], f16), [512], T([16, 128, 1], f32), T([16, 128, 1], f32), T([512], f16), T([512], f16), [True, True, True]), {})
Operator: aten.new_empty_strided.default
cnt: 1, ((T([30522, 128], f16, stride=(1, 30522)), [30522, 128], [128, 1]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([2048, 30522], f16), T([2048], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([2048, 30522], f16), T([2048], i64), None, 1, -100), {})
Operator: aten.relu.default
cnt: 97, ((T([16, 128, 512], f16),), {})
Operator: aten.rsub.Scalar
cnt: 1, ((T([16, 1, 1, 128], f16), 1.0), {})
Operator: aten.slice_backward.default
cnt: 1, ((T([16, 127, 128], f16), [16, 128, 128], 1, 0, -1, 1), {})
cnt: 2, ((T([16, 128, 128], f16), [16, 128, 128], 0, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([16, 127, 128], f16), [16, 128, 128], 1, 1, 9223372036854775807, 1), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([16, 128, 30522], f16), [0, 1], True), {})
cnt: 122, ((T([2048, 512], f16), [0], True), {})
cnt: 50, ((T([16, 128, 512], f16), [0, 1], True), {})
cnt: 336, ((T([16, 128, 128], f16), [0, 1], True), {})
cnt: 240, ((T([2048, 128], f16), [0], True), {})
cnt: 1, ((T([16, 128, 512], f16), [0], True), {})
Operator: aten.threshold_backward.default
cnt: 97, ((T([16, 128, 512], f16), T([16, 128, 512], f16), 0), {})

View File

@ -0,0 +1,106 @@
Operator: aten._log_softmax.default
cnt: 2, ((T([32, 128], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 2, ((T([32, 128], f16), T([32, 128], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 24, ((T([32, 4, 128, 128], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 24, ((T([32, 4, 128, 128], f16), T([32, 4, 128, 128], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([32, 1, 1, 128], f32),), {'dtype': f16})
Operator: aten._unsafe_view.default
cnt: 72, ((T([32, 4, 128, 32], f16), [128, 128, 32]), {})
cnt: 24, ((T([32, 4, 32, 128], f16), [128, 32, 128]), {})
cnt: 24, ((T([128, 128, 128], f16), [32, 4, 128, 128]), {})
cnt: 24, ((T([128, 128, 32], f16), [32, 4, 128, 32]), {})
cnt: 48, ((T([32, 128, 4, 32], f16), [32, 128, 128]), {})
cnt: 24, ((T([32, 128, 128], f16), [4096, 128]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([32, 128, 512], f16), T([1, 128, 512], f16)), {})
cnt: 97, ((T([32, 128, 512], f16), T([32, 128, 512], f16)), {})
cnt: 25, ((T([32, 128, 512], f16), T([512], f16)), {})
cnt: 168, ((T([32, 128, 128], f16), T([128], f16)), {})
cnt: 24, ((T([32, 4, 128, 128], f16), T([32, 1, 1, 128], f16)), {})
cnt: 241, ((T([32, 128, 128], f16), T([32, 128, 128], f16)), {})
cnt: 1, ((T([], f16), T([], f16)), {})
cnt: 1, ((T([32, 128, 128], f16, stride=(49152, 384, 1)), T([32, 128, 128], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([512], f16), T([4096, 384], f16), T([384, 512], f16, stride=(1, 384))), {})
cnt: 168, ((T([128], f16), T([4096, 512], f16), T([512, 128], f16, stride=(1, 512))), {})
cnt: 72, ((T([128], f16), T([4096, 128], f16), T([128, 128], f16, stride=(1, 128))), {})
cnt: 120, ((T([512], f16), T([4096, 128], f16), T([128, 512], f16, stride=(1, 128))), {})
cnt: 1, ((T([2], f16), T([4096, 512], f16), T([512, 2], f16, stride=(1, 512))), {})
Operator: aten.bmm.default
cnt: 24, ((T([128, 128, 32], f16), T([128, 32, 128], f16)), {})
cnt: 24, ((T([128, 128, 128], f16), T([128, 128, 32], f16)), {})
cnt: 24, ((T([128, 128, 128], f16, stride=(16384, 1, 128)), T([128, 128, 32], f16)), {})
cnt: 24, ((T([128, 128, 32], f16), T([128, 32, 128], f16, stride=(4096, 1, 32))), {})
cnt: 24, ((T([128, 32, 128], f16, stride=(4096, 1, 32)), T([128, 128, 128], f16)), {})
cnt: 24, ((T([128, 128, 128], f16), T([128, 128, 32], f16, stride=(4096, 1, 128))), {})
Operator: aten.cat.default
cnt: 1, (([T([32, 128, 128], f16), T([32, 128, 128], f16), T([32, 128, 128], f16)], 2), {})
cnt: 1, (([T([32, 128, 1], f16), T([32, 128, 1], f16)], 2), {})
Operator: aten.clamp.default
cnt: 2, ((T([32], i64), 0, 128), {})
Operator: aten.clone.default
cnt: 1, ((T([32, 128], i64),), {})
cnt: 2, ((T([32], i64),), {})
Operator: aten.constant_pad_nd.default
cnt: 1, ((T([32, 127, 128], f16, stride=(16384, 128, 1)), [0, 0, 0, 1, 0, 0], 0.0), {})
cnt: 1, ((T([32, 127, 128], f16, stride=(16384, 128, 1)), [0, 0, 1, 0, 0, 0], 0.0), {})
cnt: 1, ((T([32, 128, 128], f16, stride=(49152, 384, 1)), [0, 0, -1, 0, 0, 0]), {})
cnt: 1, ((T([32, 128, 128], f16, stride=(49152, 384, 1)), [0, 0, 0, -1, 0, 0]), {})
Operator: aten.copy_.default
cnt: 1, ((T([32, 128], i64), T([32, 128], i64)), {})
cnt: 2, ((T([32], i64), T([32], i64)), {})
Operator: aten.div.Tensor
cnt: 48, ((T([32, 4, 128, 128], f16), 5.656854249492381), {})
cnt: 2, ((T([], f16), 2), {})
Operator: aten.embedding.default
cnt: 1, ((T([30522, 128], f16), T([32, 128], i64), 0), {})
cnt: 1, ((T([512, 512], f16), T([1, 128], i64)), {})
cnt: 1, ((T([2, 512], f16), T([32, 128], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([32, 128, 512], f16), T([32, 128], i64), 2, -1, False), {})
cnt: 1, ((T([1, 128, 512], f16), T([1, 128], i64), 512, -1, False), {})
cnt: 1, ((T([32, 128, 128], f16), T([32, 128], i64), 30522, 0, False), {})
Operator: aten.mm.default
cnt: 1, ((T([4096, 2], f16), T([2, 512], f16)), {})
cnt: 1, ((T([2, 4096], f16, stride=(1, 2)), T([4096, 512], f16)), {})
cnt: 120, ((T([4096, 512], f16), T([512, 128], f16)), {})
cnt: 120, ((T([512, 4096], f16, stride=(1, 512)), T([4096, 128], f16)), {})
cnt: 168, ((T([4096, 128], f16), T([128, 512], f16)), {})
cnt: 168, ((T([128, 4096], f16, stride=(1, 128)), T([4096, 512], f16)), {})
cnt: 72, ((T([4096, 128], f16), T([128, 128], f16)), {})
cnt: 72, ((T([128, 4096], f16, stride=(1, 128)), T([4096, 128], f16)), {})
cnt: 1, ((T([4096, 512], f16), T([512, 384], f16)), {})
cnt: 1, ((T([512, 4096], f16, stride=(1, 512)), T([4096, 384], f16)), {})
Operator: aten.mul.Tensor
cnt: 1, ((T([32, 1, 1, 128], f16), -65504.0), {})
cnt: 50, ((T([32, 128, 512], f16), T([512], f16)), {})
cnt: 336, ((T([32, 128, 128], f16), T([128], f16)), {})
cnt: 25, ((T([32, 128, 512], f16), T([32, 128, 512], f16)), {})
cnt: 168, ((T([32, 128, 128], f16), T([32, 128, 128], f16)), {})
Operator: aten.nll_loss_backward.default
cnt: 2, ((T([], f16), T([32, 128], f16), T([32], i64), None, 1, 128, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 2, ((T([32, 128], f16), T([32], i64), None, 1, 128), {})
Operator: aten.relu.default
cnt: 96, ((T([32, 128, 512], f16),), {})
Operator: aten.rsub.Scalar
cnt: 1, ((T([32, 1, 1, 128], f16), 1.0), {})
Operator: aten.slice_backward.default
cnt: 1, ((T([32, 127, 128], f16), [32, 128, 128], 1, 0, -1, 1), {})
cnt: 2, ((T([32, 128, 128], f16), [32, 128, 128], 0, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 127, 128], f16), [32, 128, 128], 1, 1, 9223372036854775807, 1), {})
Operator: aten.split.Tensor
cnt: 1, ((T([32, 128, 2], f16), 1, -1), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([4096, 2], f16), [0], True), {})
cnt: 50, ((T([32, 128, 512], f16), [0, 1], True), {})
cnt: 121, ((T([4096, 512], f16), [0], True), {})
cnt: 336, ((T([32, 128, 128], f16), [0, 1], True), {})
cnt: 240, ((T([4096, 128], f16), [0], True), {})
cnt: 1, ((T([32, 128, 512], f16), [0], True), {})
Operator: aten.threshold_backward.default
cnt: 96, ((T([32, 128, 512], f16), T([32, 128, 512], f16), 0), {})

View File

@ -0,0 +1,103 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([508, 50272], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([508, 50272], f16), T([508, 50272], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 12, ((T([48, 128, 128], f16), -1, True), {})
Operator: aten._softmax_backward_data.default
cnt: 12, ((T([48, 128, 128], f32), T([48, 128, 128], f32), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([4, 128], b8),), {'dtype': i64})
cnt: 1, ((T([128, 128], f32),), {'dtype': f16})
cnt: 1, ((T([4, 1, 128, 128], f16, stride=(0, 16384, 128, 1)),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
cnt: 1, ((T([4, 1, 128, 128], b8, stride=(128, 128, 0, 1)),), {'dtype': f16})
cnt: 1, ((T([4, 1, 128, 128], f16),), {'dtype': torch.bool})
cnt: 12, ((T([48, 128, 128], f32),), {'dtype': f16})
cnt: 12, ((T([48, 128, 128], f16),), {'dtype': f32, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten._unsafe_view.default
cnt: 36, ((T([4, 128, 12, 64], f16), [4, 128, 768]), {})
cnt: 1, ((T([512, 50272], f16), [4, 128, 50272]), {})
cnt: 12, ((T([4, 12, 128, 64], f16), [48, 128, 64]), {})
cnt: 12, ((T([4, 128, 768], f16), [512, 768]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([4, 128], i64), 2), {})
cnt: 1, ((T([128], i64), 1), {})
cnt: 1, ((T([4, 1, 128, 128], f16), T([4, 1, 128, 128], f16)), {})
cnt: 49, ((T([4, 128, 768], f16), T([4, 128, 768], f16)), {})
cnt: 12, ((T([4, 12, 128, 128], f16), T([4, 1, 128, 128], f16)), {})
cnt: 24, ((T([512, 768], f16), T([512, 768], f16)), {})
cnt: 1, ((T([50272, 768], f16), T([50272, 768], f16)), {})
Operator: aten.addmm.default
cnt: 48, ((T([768], f16), T([512, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 12, ((T([3072], f16), T([512, 768], f16), T([768, 3072], f16, stride=(1, 768))), {})
cnt: 12, ((T([768], f16), T([512, 3072], f16), T([3072, 768], f16, stride=(1, 3072))), {})
Operator: aten.bmm.default
cnt: 24, ((T([48, 128, 64], f16), T([48, 64, 128], f16, stride=(8192, 1, 64))), {})
cnt: 24, ((T([48, 128, 128], f16), T([48, 128, 64], f16)), {})
cnt: 12, ((T([48, 128, 128], f16, stride=(16384, 1, 128)), T([48, 128, 64], f16)), {})
cnt: 12, ((T([48, 64, 128], f16, stride=(8192, 1, 64)), T([48, 128, 128], f16)), {})
Operator: aten.clone.default
cnt: 2, ((T([4, 128], i64),), {})
Operator: aten.copy_.default
cnt: 2, ((T([4, 128], i64), T([4, 128], i64)), {})
Operator: aten.cumsum.default
cnt: 1, ((T([4, 128], i64), 1), {})
Operator: aten.div.Scalar
cnt: 12, ((T([4, 12, 128, 128], f16), 2), {})
Operator: aten.embedding.default
cnt: 1, ((T([50272, 768], f16), T([4, 128], i64), 1), {})
cnt: 1, ((T([2050, 768], f16), T([4, 128], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([4, 128, 768], f16), T([4, 128], i64), 2050, -1, False), {})
cnt: 1, ((T([4, 128, 768], f16), T([4, 128], i64), 50272, 1, False), {})
Operator: aten.eq.Tensor
cnt: 12, ((T([4, 12, 128, 128], f16), T([], f32)), {})
Operator: aten.lt.Tensor
cnt: 1, ((T([128], i64), T([128, 1], i64)), {})
cnt: 12, ((T([4, 12, 128, 128], f16), T([], f32)), {})
Operator: aten.masked_fill.Scalar
cnt: 1, ((T([4, 1, 128, 128], f16), T([4, 1, 128, 128], b8), -65504.0), {})
Operator: aten.masked_fill_.Scalar
cnt: 1, ((T([128, 128], f32), T([128, 128], b8), 0), {})
cnt: 12, ((T([4, 12, 128, 128], f16), T([4, 12, 128, 128], b8), 0), {})
Operator: aten.maximum.default
cnt: 12, ((T([4, 12, 128, 128], f16), T([], f32)), {})
Operator: aten.mm.default
cnt: 1, ((T([512, 768], f16), T([768, 50272], f16, stride=(1, 768))), {})
cnt: 1, ((T([50272, 512], f16, stride=(1, 50272)), T([512, 768], f16)), {})
cnt: 1, ((T([512, 50272], f16), T([50272, 768], f16)), {})
cnt: 12, ((T([512, 768], f16), T([768, 3072], f16)), {})
cnt: 12, ((T([768, 512], f16, stride=(1, 768)), T([512, 3072], f16)), {})
cnt: 12, ((T([512, 3072], f16), T([3072, 768], f16)), {})
cnt: 12, ((T([3072, 512], f16, stride=(1, 3072)), T([512, 768], f16)), {})
cnt: 48, ((T([512, 768], f16), T([768, 768], f16)), {})
cnt: 48, ((T([768, 512], f16, stride=(1, 768)), T([512, 768], f16)), {})
Operator: aten.mul.Tensor
cnt: 1, ((T([4, 128], i64), T([4, 128], i64)), {})
cnt: 24, ((T([4, 128, 768], f16), 0.125), {})
Operator: aten.native_layer_norm.default
cnt: 13, ((T([4, 128, 768], f16), [768], T([768], f16), T([768], f16), 1e-05), {})
cnt: 12, ((T([512, 768], f16), [768], T([768], f16), T([768], f16), 1e-05), {})
Operator: aten.native_layer_norm_backward.default
cnt: 13, ((T([4, 128, 768], f16), T([4, 128, 768], f16), [768], T([4, 128, 1], f32), T([4, 128, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
cnt: 12, ((T([512, 768], f16), T([512, 768], f16), [768], T([512, 1], f32), T([512, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([508, 50272], f16), T([508], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([508, 50272], f16), T([508], i64), None, 1, -100), {})
Operator: aten.relu.default
cnt: 12, ((T([512, 3072], f16),), {})
Operator: aten.rsub.Scalar
cnt: 1, ((T([4, 1, 128, 128], f16), 1.0), {})
Operator: aten.slice_backward.default
cnt: 1, ((T([4, 127, 50272], f16), [4, 127, 50272], 2, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([4, 127, 50272], f16), [4, 128, 50272], 1, 0, -1, 1), {})
Operator: aten.sub.Tensor
cnt: 1, ((T([4, 128], i64), 1), {})
Operator: aten.sum.SymInt
cnt: 60, ((T([512, 768], f16), [0], True), {})
cnt: 12, ((T([512, 3072], f16), [0], True), {})
Operator: aten.threshold_backward.default
cnt: 12, ((T([512, 3072], f16), T([512, 3072], f16), 0), {})
Operator: aten.where.self
cnt: 12, ((T([4, 12, 128, 128], b8), T([4, 12, 128, 128], f16), T([4, 12, 128, 128], f16)), {})

View File

@ -0,0 +1,73 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([2048, 50005], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([2048, 50005], f16), T([2048, 50005], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 6, ((T([192, 128, 128], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 6, ((T([192, 128, 128], f16), T([192, 128, 128], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([128, 128], f32),), {'dtype': f16})
cnt: 1, ((T([16, 1, 128, 128], f16, stride=(0, 16384, 128, 1)),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten._unsafe_view.default
cnt: 18, ((T([16, 128, 12, 64], f16), [16, 128, 768]), {})
cnt: 1, ((T([2048, 50005], f16), [16, 128, 50005]), {})
cnt: 6, ((T([16, 12, 128, 64], f16), [192, 128, 64]), {})
cnt: 6, ((T([16, 128, 768], f16), [2048, 768]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([128], i64), 1), {})
cnt: 1, ((T([16, 128], i64, stride=(0, 1)), 2), {})
cnt: 37, ((T([16, 128, 768], f16), T([16, 128, 768], f16)), {})
cnt: 6, ((T([16, 12, 128, 128], f16), T([16, 1, 128, 128], f16)), {})
cnt: 1, ((T([50005, 768], f16), T([50005, 768], f16)), {})
Operator: aten.addmm.default
cnt: 24, ((T([768], f16), T([2048, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 6, ((T([3072], f16), T([2048, 768], f16), T([768, 3072], f16, stride=(1, 768))), {})
cnt: 6, ((T([768], f16), T([2048, 3072], f16), T([3072, 768], f16, stride=(1, 3072))), {})
Operator: aten.bmm.default
cnt: 12, ((T([192, 128, 64], f16), T([192, 64, 128], f16, stride=(8192, 1, 64))), {})
cnt: 12, ((T([192, 128, 128], f16), T([192, 128, 64], f16)), {})
cnt: 6, ((T([192, 128, 128], f16, stride=(16384, 1, 128)), T([192, 128, 64], f16)), {})
cnt: 6, ((T([192, 64, 128], f16, stride=(8192, 1, 64)), T([192, 128, 128], f16)), {})
Operator: aten.clone.default
cnt: 2, ((T([16, 128], i64),), {})
Operator: aten.copy_.default
cnt: 2, ((T([16, 128], i64), T([16, 128], i64)), {})
Operator: aten.embedding.default
cnt: 1, ((T([50005, 768], f16), T([16, 128], i64), 1), {})
cnt: 1, ((T([1026, 768], f16), T([16, 128], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([16, 128, 768], f16), T([16, 128], i64), 1026, -1, False), {})
cnt: 1, ((T([16, 128, 768], f16), T([16, 128], i64), 50005, 1, False), {})
Operator: aten.gelu.default
cnt: 6, ((T([16, 128, 3072], f16),), {})
Operator: aten.gelu_backward.default
cnt: 6, ((T([16, 128, 3072], f16), T([16, 128, 3072], f16)), {})
Operator: aten.lt.Tensor
cnt: 1, ((T([128], i64), T([128, 1], i64)), {})
Operator: aten.masked_fill_.Scalar
cnt: 1, ((T([128, 128], f32), T([128, 128], b8), 0), {})
Operator: aten.mm.default
cnt: 1, ((T([2048, 768], f16), T([768, 50005], f16, stride=(1, 768))), {})
cnt: 1, ((T([50005, 2048], f16, stride=(1, 50005)), T([2048, 768], f16)), {})
cnt: 1, ((T([2048, 50005], f16), T([50005, 768], f16)), {})
cnt: 6, ((T([2048, 768], f16), T([768, 3072], f16)), {})
cnt: 6, ((T([768, 2048], f16, stride=(1, 768)), T([2048, 3072], f16)), {})
cnt: 6, ((T([2048, 3072], f16), T([3072, 768], f16)), {})
cnt: 6, ((T([3072, 2048], f16, stride=(1, 3072)), T([2048, 768], f16)), {})
cnt: 24, ((T([2048, 768], f16), T([768, 768], f16)), {})
cnt: 24, ((T([768, 2048], f16, stride=(1, 768)), T([2048, 768], f16)), {})
Operator: aten.mul.Tensor
cnt: 2, ((T([16, 128, 768], f16), 27.712812921102035), {})
cnt: 12, ((T([16, 128, 768], f16), 0.125), {})
Operator: aten.native_layer_norm.default
cnt: 13, ((T([16, 128, 768], f16), [768], T([768], f16), T([768], f16), 1e-05), {})
Operator: aten.native_layer_norm_backward.default
cnt: 13, ((T([16, 128, 768], f16), T([16, 128, 768], f16), [768], T([16, 128, 1], f32), T([16, 128, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([2048, 50005], f16), T([2048], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([2048, 50005], f16), T([2048], i64), None, 1, -100), {})
Operator: aten.sum.SymInt
cnt: 30, ((T([2048, 768], f16), [0], True), {})
cnt: 6, ((T([2048, 3072], f16), [0], True), {})

View File

@ -0,0 +1,94 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([1024, 50005], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([1024, 50005], f16), T([1024, 50005], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 18, ((T([96, 128, 128], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 18, ((T([96, 128, 128], f16), T([96, 128, 128], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([128, 128], f32),), {'dtype': f16})
cnt: 1, ((T([8, 1, 128, 128], f16, stride=(0, 16384, 128, 1)),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten._unsafe_view.default
cnt: 54, ((T([8, 128, 12, 64], f16), [8, 128, 768]), {})
cnt: 1, ((T([1024, 50005], f16), [8, 128, 50005]), {})
cnt: 18, ((T([8, 12, 128, 64], f16), [96, 128, 64]), {})
cnt: 18, ((T([8, 128, 768], f16), [1024, 768]), {})
Operator: aten.add.Tensor
cnt: 2, ((T([8, 128], i64, stride=(0, 1)), 2), {})
cnt: 97, ((T([8, 128, 768], f16), T([8, 128, 768], f16)), {})
cnt: 1, ((T([128], i64), 1), {})
cnt: 6, ((T([8, 12, 128, 128], f16), T([8, 1, 128, 128], f16)), {})
cnt: 1, ((T([8, 128, 50005], f16), T([1, 50005], f16)), {})
cnt: 2, ((T([50005, 768], f16), T([50005, 768], f16)), {})
Operator: aten.addmm.default
cnt: 72, ((T([768], f16), T([1024, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 12, ((T([3072], f16), T([1024, 768], f16), T([768, 3072], f16, stride=(1, 768))), {})
cnt: 12, ((T([768], f16), T([1024, 3072], f16), T([3072, 768], f16, stride=(1, 3072))), {})
Operator: aten.any.default
cnt: 12, ((T([8, 128, 768], b8),), {})
Operator: aten.bmm.default
cnt: 36, ((T([96, 128, 64], f16), T([96, 64, 128], f16, stride=(8192, 1, 64))), {})
cnt: 36, ((T([96, 128, 128], f16), T([96, 128, 64], f16)), {})
cnt: 18, ((T([96, 128, 128], f16, stride=(16384, 1, 128)), T([96, 128, 64], f16)), {})
cnt: 18, ((T([96, 64, 128], f16, stride=(8192, 1, 64)), T([96, 128, 128], f16)), {})
Operator: aten.clone.default
cnt: 3, ((T([8, 128], i64),), {})
cnt: 1, ((T([8, 127], i64, stride=(128, 1)),), {})
Operator: aten.copy_.default
cnt: 2, ((T([8, 128], i64), T([8, 128], i64)), {})
cnt: 1, ((T([8, 127], i64, stride=(128, 1)), T([8, 127], i64)), {})
cnt: 1, ((T([8], i64, stride=(128,)), T([8], i64)), {})
Operator: aten.embedding.default
cnt: 2, ((T([50005, 768], f16), T([8, 128], i64), 1), {})
cnt: 2, ((T([1026, 768], f16), T([8, 128], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 2, ((T([8, 128, 768], f16), T([8, 128], i64), 1026, -1, False), {})
cnt: 2, ((T([8, 128, 768], f16), T([8, 128], i64), 50005, 1, False), {})
Operator: aten.eq.Scalar
cnt: 1, ((T([8, 128], i64), -100), {})
Operator: aten.gather.default
cnt: 1, ((T([8, 128], i64), 1, T([8, 1], i64)), {})
Operator: aten.gelu.default
cnt: 12, ((T([8, 128, 3072], f16),), {})
Operator: aten.gelu_backward.default
cnt: 12, ((T([8, 128, 3072], f16), T([8, 128, 3072], f16)), {})
Operator: aten.isinf.default
cnt: 6, ((T([8, 128, 768], f16),), {})
Operator: aten.isnan.default
cnt: 6, ((T([8, 128, 768], f16),), {})
Operator: aten.lt.Tensor
cnt: 1, ((T([128], i64), T([128, 1], i64)), {})
Operator: aten.masked_fill_.Scalar
cnt: 1, ((T([8, 128], i64), T([8, 128], b8), 1), {})
cnt: 1, ((T([128, 128], f32), T([128, 128], b8), 0), {})
Operator: aten.mm.default
cnt: 1, ((T([1024, 768], f16), T([768, 50005], f16, stride=(1, 768))), {})
cnt: 1, ((T([50005, 1024], f16, stride=(1, 50005)), T([1024, 768], f16)), {})
cnt: 1, ((T([1024, 50005], f16), T([50005, 768], f16)), {})
cnt: 12, ((T([1024, 768], f16), T([768, 3072], f16)), {})
cnt: 12, ((T([768, 1024], f16, stride=(1, 768)), T([1024, 3072], f16)), {})
cnt: 12, ((T([1024, 3072], f16), T([3072, 768], f16)), {})
cnt: 12, ((T([3072, 1024], f16, stride=(1, 3072)), T([1024, 768], f16)), {})
cnt: 72, ((T([1024, 768], f16), T([768, 768], f16)), {})
cnt: 72, ((T([768, 1024], f16, stride=(1, 768)), T([1024, 768], f16)), {})
Operator: aten.mul.Tensor
cnt: 4, ((T([8, 128, 768], f16), 27.712812921102035), {})
cnt: 36, ((T([8, 128, 768], f16), 0.125), {})
Operator: aten.native_layer_norm.default
cnt: 32, ((T([8, 128, 768], f16), [768], T([768], f16), T([768], f16), 1e-05), {})
Operator: aten.native_layer_norm_backward.default
cnt: 32, ((T([8, 128, 768], f16), T([8, 128, 768], f16), [768], T([8, 128, 1], f32), T([8, 128, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
Operator: aten.ne.Scalar
cnt: 1, ((T([8, 128], i64), 1), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([1024, 50005], f16), T([1024], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([1024, 50005], f16), T([1024], i64), None, 1, -100), {})
Operator: aten.sub.Tensor
cnt: 1, ((T([8], i64), 1), {})
Operator: aten.sum.SymInt
cnt: 84, ((T([1024, 768], f16), [0], True), {})
cnt: 12, ((T([1024, 3072], f16), [0], True), {})
Operator: aten.sum.dim_IntList
cnt: 1, ((T([8, 128], b8), [1]), {})

View File

@ -0,0 +1,72 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([1024, 50265], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([1024, 50265], f16), T([1024, 50265], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 12, ((T([128, 128, 128], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 12, ((T([128, 128, 128], f16), T([128, 128, 128], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([128, 128], f32),), {'dtype': f16})
cnt: 1, ((T([8, 1, 128, 128], f16, stride=(0, 16384, 128, 1)),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten._unsafe_view.default
cnt: 36, ((T([8, 128, 16, 64], f16), [8, 128, 1024]), {})
cnt: 1, ((T([1024, 50265], f16), [8, 128, 50265]), {})
cnt: 12, ((T([8, 16, 128, 64], f16), [128, 128, 64]), {})
cnt: 12, ((T([8, 128, 1024], f16), [1024, 1024]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([128], i64), 1), {})
cnt: 1, ((T([8, 128, 1024], f16), T([128, 1024], f16)), {})
cnt: 12, ((T([8, 16, 128, 128], f16), T([8, 1, 128, 128], f16)), {})
cnt: 72, ((T([8, 128, 1024], f16), T([8, 128, 1024], f16)), {})
cnt: 1, ((T([50265, 1024], f16), T([50265, 1024], f16)), {})
Operator: aten.addmm.default
cnt: 48, ((T([1024], f16), T([1024, 1024], f16), T([1024, 1024], f16, stride=(1, 1024))), {})
cnt: 12, ((T([4096], f16), T([1024, 1024], f16), T([1024, 4096], f16, stride=(1, 1024))), {})
cnt: 12, ((T([1024], f16), T([1024, 4096], f16), T([4096, 1024], f16, stride=(1, 4096))), {})
Operator: aten.bmm.default
cnt: 24, ((T([128, 128, 64], f16), T([128, 64, 128], f16, stride=(8192, 1, 64))), {})
cnt: 24, ((T([128, 128, 128], f16), T([128, 128, 64], f16)), {})
cnt: 12, ((T([128, 128, 128], f16, stride=(16384, 1, 128)), T([128, 128, 64], f16)), {})
cnt: 12, ((T([128, 64, 128], f16, stride=(8192, 1, 64)), T([128, 128, 128], f16)), {})
Operator: aten.clone.default
cnt: 2, ((T([8, 128], i64),), {})
Operator: aten.copy_.default
cnt: 2, ((T([8, 128], i64), T([8, 128], i64)), {})
Operator: aten.embedding.default
cnt: 1, ((T([50265, 1024], f16), T([8, 128], i64), 0), {})
cnt: 1, ((T([1024, 1024], f16), T([128], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([8, 128, 1024], f16), T([8, 128], i64), 50265, 0, False), {})
Operator: aten.gelu.default
cnt: 12, ((T([8, 128, 4096], f16),), {})
Operator: aten.gelu_backward.default
cnt: 12, ((T([8, 128, 4096], f16), T([8, 128, 4096], f16)), {})
Operator: aten.lt.Tensor
cnt: 1, ((T([128], i64), T([128, 1], i64)), {})
Operator: aten.masked_fill_.Scalar
cnt: 1, ((T([128, 128], f32), T([128, 128], b8), 0), {})
Operator: aten.mm.default
cnt: 1, ((T([1024, 1024], f16), T([1024, 50265], f16, stride=(1, 1024))), {})
cnt: 1, ((T([50265, 1024], f16, stride=(1, 50265)), T([1024, 1024], f16)), {})
cnt: 1, ((T([1024, 50265], f16), T([50265, 1024], f16)), {})
cnt: 12, ((T([1024, 1024], f16), T([1024, 4096], f16)), {})
cnt: 12, ((T([1024, 1024], f16, stride=(1, 1024)), T([1024, 4096], f16)), {})
cnt: 12, ((T([1024, 4096], f16), T([4096, 1024], f16)), {})
cnt: 12, ((T([4096, 1024], f16, stride=(1, 4096)), T([1024, 1024], f16)), {})
cnt: 48, ((T([1024, 1024], f16), T([1024, 1024], f16)), {})
cnt: 48, ((T([1024, 1024], f16, stride=(1, 1024)), T([1024, 1024], f16)), {})
Operator: aten.mul.Tensor
cnt: 2, ((T([8, 128, 1024], f16), 1.0), {})
cnt: 24, ((T([8, 128, 1024], f16), 0.125), {})
Operator: aten.native_layer_norm.default
cnt: 25, ((T([8, 128, 1024], f16), [1024], T([1024], f16), T([1024], f16), 1e-05), {})
Operator: aten.native_layer_norm_backward.default
cnt: 25, ((T([8, 128, 1024], f16), T([8, 128, 1024], f16), [1024], T([8, 128, 1], f32), T([8, 128, 1], f32), T([1024], f16), T([1024], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([1024, 50265], f16), T([1024], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([1024, 50265], f16), T([1024], i64), None, 1, -100), {})
Operator: aten.sum.SymInt
cnt: 60, ((T([1024, 1024], f16), [0], True), {})
cnt: 12, ((T([1024, 4096], f16), [0], True), {})

View File

@ -0,0 +1,79 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([512, 50265], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([512, 50265], f16), T([512, 50265], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 36, ((T([64, 128, 128], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 36, ((T([64, 128, 128], f16), T([64, 128, 128], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([128, 128], f32),), {'dtype': f16})
cnt: 1, ((T([4, 1, 128, 128], f16, stride=(0, 16384, 128, 1)),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten._unsafe_view.default
cnt: 108, ((T([4, 128, 16, 64], f16), [4, 128, 1024]), {})
cnt: 1, ((T([512, 50265], f16), [4, 128, 50265]), {})
cnt: 36, ((T([4, 16, 128, 64], f16), [64, 128, 64]), {})
cnt: 36, ((T([4, 128, 1024], f16), [512, 1024]), {})
Operator: aten.add.Tensor
cnt: 2, ((T([4, 128, 1024], f16), T([128, 1024], f16)), {})
cnt: 191, ((T([4, 128, 1024], f16), T([4, 128, 1024], f16)), {})
cnt: 1, ((T([128], i64), 1), {})
cnt: 12, ((T([4, 16, 128, 128], f16), T([4, 1, 128, 128], f16)), {})
cnt: 1, ((T([4, 128, 50265], f16), T([1, 50265], f16)), {})
cnt: 2, ((T([50265, 1024], f16), T([50265, 1024], f16)), {})
Operator: aten.addmm.default
cnt: 144, ((T([1024], f16), T([512, 1024], f16), T([1024, 1024], f16, stride=(1, 1024))), {})
cnt: 24, ((T([4096], f16), T([512, 1024], f16), T([1024, 4096], f16, stride=(1, 1024))), {})
cnt: 24, ((T([1024], f16), T([512, 4096], f16), T([4096, 1024], f16, stride=(1, 4096))), {})
Operator: aten.any.default
cnt: 24, ((T([4, 128, 1024], b8),), {})
Operator: aten.bmm.default
cnt: 72, ((T([64, 128, 64], f16), T([64, 64, 128], f16, stride=(8192, 1, 64))), {})
cnt: 72, ((T([64, 128, 128], f16), T([64, 128, 64], f16)), {})
cnt: 36, ((T([64, 128, 128], f16, stride=(16384, 1, 128)), T([64, 128, 64], f16)), {})
cnt: 36, ((T([64, 64, 128], f16, stride=(8192, 1, 64)), T([64, 128, 128], f16)), {})
Operator: aten.clone.default
cnt: 3, ((T([4, 128], i64),), {})
Operator: aten.copy_.default
cnt: 3, ((T([4, 128], i64), T([4, 128], i64)), {})
Operator: aten.embedding.default
cnt: 2, ((T([50265, 1024], f16), T([4, 128], i64), 0), {})
cnt: 2, ((T([1024, 1024], f16), T([128], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 2, ((T([4, 128, 1024], f16), T([4, 128], i64), 50265, 0, False), {})
Operator: aten.gelu.default
cnt: 24, ((T([4, 128, 4096], f16),), {})
Operator: aten.gelu_backward.default
cnt: 24, ((T([4, 128, 4096], f16), T([4, 128, 4096], f16)), {})
Operator: aten.isinf.default
cnt: 12, ((T([4, 128, 1024], f16),), {})
Operator: aten.isnan.default
cnt: 12, ((T([4, 128, 1024], f16),), {})
Operator: aten.lt.Tensor
cnt: 1, ((T([128], i64), T([128, 1], i64)), {})
Operator: aten.masked_fill_.Scalar
cnt: 1, ((T([128, 128], f32), T([128, 128], b8), 0), {})
Operator: aten.mm.default
cnt: 1, ((T([512, 1024], f16), T([1024, 50265], f16, stride=(1, 1024))), {})
cnt: 1, ((T([50265, 512], f16, stride=(1, 50265)), T([512, 1024], f16)), {})
cnt: 1, ((T([512, 50265], f16), T([50265, 1024], f16)), {})
cnt: 24, ((T([512, 1024], f16), T([1024, 4096], f16)), {})
cnt: 24, ((T([1024, 512], f16, stride=(1, 1024)), T([512, 4096], f16)), {})
cnt: 24, ((T([512, 4096], f16), T([4096, 1024], f16)), {})
cnt: 24, ((T([4096, 512], f16, stride=(1, 4096)), T([512, 1024], f16)), {})
cnt: 144, ((T([512, 1024], f16), T([1024, 1024], f16)), {})
cnt: 144, ((T([1024, 512], f16, stride=(1, 1024)), T([512, 1024], f16)), {})
Operator: aten.mul.Tensor
cnt: 4, ((T([4, 128, 1024], f16), 1.0), {})
cnt: 72, ((T([4, 128, 1024], f16), 0.125), {})
Operator: aten.native_layer_norm.default
cnt: 62, ((T([4, 128, 1024], f16), [1024], T([1024], f16), T([1024], f16), 1e-05), {})
Operator: aten.native_layer_norm_backward.default
cnt: 62, ((T([4, 128, 1024], f16), T([4, 128, 1024], f16), [1024], T([4, 128, 1], f32), T([4, 128, 1], f32), T([1024], f16), T([1024], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([512, 50265], f16), T([512], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([512, 50265], f16), T([512], i64), None, 1, -100), {})
Operator: aten.sum.SymInt
cnt: 168, ((T([512, 1024], f16), [0], True), {})
cnt: 24, ((T([512, 4096], f16), [0], True), {})

View File

@ -0,0 +1,94 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([508, 30522], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([508, 30522], f16), T([508, 30522], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 12, ((T([4, 12, 128, 128], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 12, ((T([4, 12, 128, 128], f16), T([4, 12, 128, 128], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([4, 1, 1, 128], f32),), {'dtype': f16})
cnt: 1, ((T([4, 128], b8),), {'dtype': i32})
cnt: 1, ((T([4, 128], i64),), {'dtype': i32, 'layout': torch.strided, 'device': 'cuda'})
cnt: 1, ((T([4, 128], i32),), {'dtype': i64})
Operator: aten._unsafe_view.default
cnt: 36, ((T([4, 12, 128, 64], f16), [48, 128, 64]), {})
cnt: 12, ((T([4, 12, 64, 128], f16), [48, 64, 128]), {})
cnt: 12, ((T([48, 128, 128], f16), [4, 12, 128, 128]), {})
cnt: 12, ((T([48, 128, 64], f16), [4, 12, 128, 64]), {})
cnt: 24, ((T([4, 128, 12, 64], f16), [4, 128, 768]), {})
cnt: 12, ((T([4, 128, 768], f16), [512, 768]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([4, 128], i32), 0), {})
cnt: 1, ((T([4, 128], i64), 0), {})
cnt: 73, ((T([4, 128, 768], f16), T([4, 128, 768], f16)), {})
cnt: 12, ((T([4, 12, 128, 128], f16), T([4, 1, 1, 128], f16)), {})
cnt: 1, ((T([30522, 768], f16), T([30522, 768], f16)), {})
Operator: aten.add_.Tensor
cnt: 1, ((T([4, 128, 768], f16), T([4, 128, 768], f16)), {})
Operator: aten.addmm.default
cnt: 49, ((T([768], f16), T([512, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 12, ((T([3072], f16), T([512, 768], f16), T([768, 3072], f16, stride=(1, 768))), {})
cnt: 12, ((T([768], f16), T([512, 3072], f16), T([3072, 768], f16, stride=(1, 3072))), {})
cnt: 1, ((T([30522], f16), T([512, 768], f16), T([768, 30522], f16, stride=(1, 768))), {})
Operator: aten.bmm.default
cnt: 12, ((T([48, 128, 64], f16), T([48, 64, 128], f16)), {})
cnt: 12, ((T([48, 128, 128], f16), T([48, 128, 64], f16)), {})
cnt: 12, ((T([48, 128, 128], f16, stride=(16384, 1, 128)), T([48, 128, 64], f16)), {})
cnt: 12, ((T([48, 128, 64], f16), T([48, 64, 128], f16, stride=(8192, 1, 64))), {})
cnt: 12, ((T([48, 64, 128], f16, stride=(8192, 1, 64)), T([48, 128, 128], f16)), {})
cnt: 12, ((T([48, 128, 128], f16), T([48, 128, 64], f16, stride=(8192, 1, 128))), {})
Operator: aten.clone.default
cnt: 2, ((T([4, 128], i64),), {})
Operator: aten.copy_.default
cnt: 2, ((T([4, 128], i64), T([4, 128], i64)), {})
Operator: aten.cumsum.default
cnt: 1, ((T([4, 128], i32), 1), {})
Operator: aten.div.Tensor
cnt: 24, ((T([4, 12, 128, 128], f16), 8.0), {})
Operator: aten.embedding.default
cnt: 1, ((T([30522, 768], f16), T([4, 128], i64), 0), {})
cnt: 1, ((T([2, 768], f16), T([4, 128], i64, stride=(0, 1))), {})
cnt: 1, ((T([512, 768], f16), T([4, 128], i64), 0), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([4, 128, 768], f16), T([4, 128], i64), 512, 0, False), {})
cnt: 1, ((T([4, 128, 768], f16), T([4, 128], i64, stride=(0, 1)), 2, -1, False), {})
cnt: 1, ((T([4, 128, 768], f16), T([4, 128], i64), 30522, 0, False), {})
Operator: aten.gelu.default
cnt: 12, ((T([4, 128, 3072], f16),), {})
cnt: 1, ((T([4, 128, 768], f16),), {})
Operator: aten.gelu_backward.default
cnt: 1, ((T([4, 128, 768], f16), T([4, 128, 768], f16)), {})
cnt: 12, ((T([4, 128, 3072], f16), T([4, 128, 3072], f16)), {})
Operator: aten.mm.default
cnt: 1, ((T([512, 30522], f16), T([30522, 768], f16)), {})
cnt: 1, ((T([30522, 512], f16, stride=(1, 30522)), T([512, 768], f16)), {})
cnt: 49, ((T([512, 768], f16), T([768, 768], f16)), {})
cnt: 49, ((T([768, 512], f16, stride=(1, 768)), T([512, 768], f16)), {})
cnt: 12, ((T([512, 768], f16), T([768, 3072], f16)), {})
cnt: 12, ((T([768, 512], f16, stride=(1, 768)), T([512, 3072], f16)), {})
cnt: 12, ((T([512, 3072], f16), T([3072, 768], f16)), {})
cnt: 12, ((T([3072, 512], f16, stride=(1, 3072)), T([512, 768], f16)), {})
Operator: aten.mul.Tensor
cnt: 1, ((T([4, 1, 1, 128], f16), -65504.0), {})
cnt: 1, ((T([4, 128], i32), T([4, 128], i32)), {})
Operator: aten.native_layer_norm.default
cnt: 26, ((T([4, 128, 768], f16), [768], T([768], f16), T([768], f16), 1e-12), {})
Operator: aten.native_layer_norm_backward.default
cnt: 26, ((T([4, 128, 768], f16), T([4, 128, 768], f16), [768], T([4, 128, 1], f32), T([4, 128, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
Operator: aten.ne.Scalar
cnt: 1, ((T([4, 128], i64), 0), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([508, 30522], f16), T([508], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([508, 30522], f16), T([508], i64), None, 1, -100), {})
Operator: aten.rsub.Scalar
cnt: 1, ((T([4, 1, 1, 128], f16), 1.0), {})
Operator: aten.slice_backward.default
cnt: 1, ((T([4, 127, 30522], f16), [4, 127, 30522], 2, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([4, 127, 30522], f16), [4, 128, 30522], 1, 0, -1, 1), {})
cnt: 1, ((T([4, 128, 30522], f16), [4, 128, 30522], 0, 0, 9223372036854775807, 1), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([512, 30522], f16), [0], True), {})
cnt: 61, ((T([512, 768], f16), [0], True), {})
cnt: 12, ((T([512, 3072], f16), [0], True), {})

View File

@ -0,0 +1,97 @@
Operator: aten._log_softmax.default
cnt: 2, ((T([64, 128], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 2, ((T([64, 128], f16), T([64, 128], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 12, ((T([64, 12, 128, 128], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 12, ((T([64, 12, 128, 128], f16), T([64, 12, 128, 128], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([64, 1, 1, 128], f32),), {'dtype': f16})
cnt: 1, ((T([64, 128], b8),), {'dtype': i32})
cnt: 1, ((T([64, 128], i64),), {'dtype': i32, 'layout': torch.strided, 'device': 'cuda'})
cnt: 1, ((T([64, 128], i32),), {'dtype': i64})
Operator: aten._unsafe_view.default
cnt: 36, ((T([64, 12, 128, 64], f16), [768, 128, 64]), {})
cnt: 12, ((T([64, 12, 64, 128], f16), [768, 64, 128]), {})
cnt: 12, ((T([768, 128, 128], f16), [64, 12, 128, 128]), {})
cnt: 12, ((T([768, 128, 64], f16), [64, 12, 128, 64]), {})
cnt: 24, ((T([64, 128, 12, 64], f16), [64, 128, 768]), {})
cnt: 12, ((T([64, 128, 768], f16), [8192, 768]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([64, 128], i32), 0), {})
cnt: 1, ((T([64, 128], i64), 0), {})
cnt: 73, ((T([64, 128, 768], f16), T([64, 128, 768], f16)), {})
cnt: 12, ((T([64, 12, 128, 128], f16), T([64, 1, 1, 128], f16)), {})
cnt: 1, ((T([], f16), T([], f16)), {})
Operator: aten.add_.Tensor
cnt: 1, ((T([64, 128, 768], f16), T([64, 128, 768], f16)), {})
Operator: aten.addmm.default
cnt: 48, ((T([768], f16), T([8192, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 12, ((T([3072], f16), T([8192, 768], f16), T([768, 3072], f16, stride=(1, 768))), {})
cnt: 12, ((T([768], f16), T([8192, 3072], f16), T([3072, 768], f16, stride=(1, 3072))), {})
cnt: 1, ((T([2], f16), T([8192, 768], f16), T([768, 2], f16, stride=(1, 768))), {})
Operator: aten.bmm.default
cnt: 12, ((T([768, 128, 64], f16), T([768, 64, 128], f16)), {})
cnt: 12, ((T([768, 128, 128], f16), T([768, 128, 64], f16)), {})
cnt: 12, ((T([768, 128, 128], f16, stride=(16384, 1, 128)), T([768, 128, 64], f16)), {})
cnt: 12, ((T([768, 128, 64], f16), T([768, 64, 128], f16, stride=(8192, 1, 64))), {})
cnt: 12, ((T([768, 64, 128], f16, stride=(8192, 1, 64)), T([768, 128, 128], f16)), {})
cnt: 12, ((T([768, 128, 128], f16), T([768, 128, 64], f16, stride=(8192, 1, 128))), {})
Operator: aten.cat.default
cnt: 1, (([T([64, 128, 1], f16), T([64, 128, 1], f16)], 2), {})
Operator: aten.clamp.default
cnt: 2, ((T([64], i64), 0, 128), {})
Operator: aten.clone.default
cnt: 1, ((T([64, 128], i64),), {})
cnt: 2, ((T([64], i64),), {})
Operator: aten.copy_.default
cnt: 1, ((T([64, 128], i64), T([64, 128], i64)), {})
cnt: 2, ((T([64], i64), T([64], i64)), {})
Operator: aten.cumsum.default
cnt: 1, ((T([64, 128], i32), 1), {})
Operator: aten.div.Tensor
cnt: 24, ((T([64, 12, 128, 128], f16), 8.0), {})
cnt: 2, ((T([], f16), 2), {})
Operator: aten.embedding.default
cnt: 1, ((T([30522, 768], f16), T([64, 128], i64), 0), {})
cnt: 1, ((T([2, 768], f16), T([64, 128], i64, stride=(0, 1))), {})
cnt: 1, ((T([512, 768], f16), T([64, 128], i64), 0), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([64, 128, 768], f16), T([64, 128], i64), 512, 0, False), {})
cnt: 1, ((T([64, 128, 768], f16), T([64, 128], i64, stride=(0, 1)), 2, -1, False), {})
cnt: 1, ((T([64, 128, 768], f16), T([64, 128], i64), 30522, 0, False), {})
Operator: aten.gelu.default
cnt: 12, ((T([64, 128, 3072], f16),), {})
Operator: aten.gelu_backward.default
cnt: 12, ((T([64, 128, 3072], f16), T([64, 128, 3072], f16)), {})
Operator: aten.mm.default
cnt: 1, ((T([8192, 2], f16), T([2, 768], f16)), {})
cnt: 1, ((T([2, 8192], f16, stride=(1, 2)), T([8192, 768], f16)), {})
cnt: 12, ((T([8192, 768], f16), T([768, 3072], f16)), {})
cnt: 12, ((T([768, 8192], f16, stride=(1, 768)), T([8192, 3072], f16)), {})
cnt: 12, ((T([8192, 3072], f16), T([3072, 768], f16)), {})
cnt: 12, ((T([3072, 8192], f16, stride=(1, 3072)), T([8192, 768], f16)), {})
cnt: 48, ((T([8192, 768], f16), T([768, 768], f16)), {})
cnt: 48, ((T([768, 8192], f16, stride=(1, 768)), T([8192, 768], f16)), {})
Operator: aten.mul.Tensor
cnt: 1, ((T([64, 1, 1, 128], f16), -65504.0), {})
cnt: 1, ((T([64, 128], i32), T([64, 128], i32)), {})
Operator: aten.native_layer_norm.default
cnt: 25, ((T([64, 128, 768], f16), [768], T([768], f16), T([768], f16), 1e-12), {})
Operator: aten.native_layer_norm_backward.default
cnt: 25, ((T([64, 128, 768], f16), T([64, 128, 768], f16), [768], T([64, 128, 1], f32), T([64, 128, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
Operator: aten.ne.Scalar
cnt: 1, ((T([64, 128], i64), 0), {})
Operator: aten.nll_loss_backward.default
cnt: 2, ((T([], f16), T([64, 128], f16), T([64], i64), None, 1, 128, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 2, ((T([64, 128], f16), T([64], i64), None, 1, 128), {})
Operator: aten.rsub.Scalar
cnt: 1, ((T([64, 1, 1, 128], f16), 1.0), {})
Operator: aten.split.Tensor
cnt: 1, ((T([64, 128, 2], f16), 1, -1), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([8192, 2], f16), [0], True), {})
cnt: 60, ((T([8192, 768], f16), [0], True), {})
cnt: 12, ((T([8192, 3072], f16), [0], True), {})

View File

@ -0,0 +1,82 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([8192, 10000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([8192, 10000], f16), T([8192, 10000], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 6, ((T([256, 128, 128], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 6, ((T([256, 128, 128], f16), T([256, 128, 128], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([128, 128], f32),), {'dtype': f16})
cnt: 1, ((T([64, 1, 128, 128], f16, stride=(0, 16384, 128, 1)),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
cnt: 1, ((T([64, 128], b8),), {'dtype': i32})
cnt: 1, ((T([64, 128], i64),), {'dtype': i32, 'layout': torch.strided, 'device': 'cuda'})
cnt: 1, ((T([64, 128], i32),), {'dtype': i64})
Operator: aten._unsafe_view.default
cnt: 18, ((T([64, 128, 4, 64], f16), [64, 128, 256]), {})
cnt: 1, ((T([8192, 10000], f16), [64, 128, 10000]), {})
cnt: 6, ((T([64, 4, 128, 64], f16), [256, 128, 64]), {})
cnt: 6, ((T([64, 128, 256], f16), [8192, 256]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([128], i64), 1), {})
cnt: 1, ((T([64, 128], i32), 0), {})
cnt: 1, ((T([64, 128], i64), 1), {})
cnt: 37, ((T([64, 128, 256], f16), T([64, 128, 256], f16)), {})
cnt: 6, ((T([64, 4, 128, 128], f16), T([64, 1, 128, 128], f16)), {})
cnt: 1, ((T([10000, 256], f16), T([10000, 256], f16)), {})
Operator: aten.addmm.default
cnt: 24, ((T([256], f16), T([8192, 256], f16), T([256, 256], f16, stride=(1, 256))), {})
cnt: 6, ((T([2048], f16), T([8192, 256], f16), T([256, 2048], f16, stride=(1, 256))), {})
cnt: 6, ((T([256], f16), T([8192, 2048], f16), T([2048, 256], f16, stride=(1, 2048))), {})
Operator: aten.bmm.default
cnt: 12, ((T([256, 128, 64], f16), T([256, 64, 128], f16, stride=(8192, 1, 64))), {})
cnt: 12, ((T([256, 128, 128], f16), T([256, 128, 64], f16)), {})
cnt: 6, ((T([256, 128, 128], f16, stride=(16384, 1, 128)), T([256, 128, 64], f16)), {})
cnt: 6, ((T([256, 64, 128], f16, stride=(8192, 1, 64)), T([256, 128, 128], f16)), {})
Operator: aten.clone.default
cnt: 2, ((T([64, 128], i64),), {})
Operator: aten.copy_.default
cnt: 2, ((T([64, 128], i64), T([64, 128], i64)), {})
Operator: aten.cumsum.default
cnt: 1, ((T([64, 128], i32), 1), {})
Operator: aten.embedding.default
cnt: 1, ((T([10000, 256], f16), T([64, 128], i64), 1), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([64, 128, 256], f16), T([64, 128], i64), 10000, 1, False), {})
Operator: aten.index_select.default
cnt: 1, ((T([1026, 256], f16), 0, T([8192], i64)), {})
Operator: aten.lt.Tensor
cnt: 1, ((T([128], i64), T([128, 1], i64)), {})
Operator: aten.masked_fill_.Scalar
cnt: 1, ((T([128, 128], f32), T([128, 128], b8), 0), {})
Operator: aten.mm.default
cnt: 1, ((T([8192, 256], f16), T([256, 10000], f16, stride=(1, 256))), {})
cnt: 1, ((T([10000, 8192], f16, stride=(1, 10000)), T([8192, 256], f16)), {})
cnt: 1, ((T([8192, 10000], f16), T([10000, 256], f16)), {})
cnt: 6, ((T([8192, 256], f16), T([256, 2048], f16)), {})
cnt: 6, ((T([256, 8192], f16, stride=(1, 256)), T([8192, 2048], f16)), {})
cnt: 6, ((T([8192, 2048], f16), T([2048, 256], f16)), {})
cnt: 6, ((T([2048, 8192], f16, stride=(1, 2048)), T([8192, 256], f16)), {})
cnt: 24, ((T([8192, 256], f16), T([256, 256], f16)), {})
cnt: 24, ((T([256, 8192], f16, stride=(1, 256)), T([8192, 256], f16)), {})
Operator: aten.mul.Tensor
cnt: 2, ((T([64, 128, 256], f16), 16.0), {})
cnt: 1, ((T([64, 128], i32), T([64, 128], i32)), {})
cnt: 12, ((T([64, 128, 256], f16), 0.125), {})
Operator: aten.native_layer_norm.default
cnt: 12, ((T([64, 128, 256], f16), [256], T([256], f16), T([256], f16), 1e-05), {})
Operator: aten.native_layer_norm_backward.default
cnt: 12, ((T([64, 128, 256], f16), T([64, 128, 256], f16), [256], T([64, 128, 1], f32), T([64, 128, 1], f32), T([256], f16), T([256], f16), [True, True, True]), {})
Operator: aten.ne.Scalar
cnt: 1, ((T([64, 128], i64), 1), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([8192, 10000], f16), T([8192], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([8192, 10000], f16), T([8192], i64), None, 1, -100), {})
Operator: aten.relu.default
cnt: 6, ((T([64, 128, 2048], f16),), {})
Operator: aten.sum.SymInt
cnt: 30, ((T([8192, 256], f16), [0], True), {})
cnt: 6, ((T([8192, 2048], f16), [0], True), {})
Operator: aten.threshold_backward.default
cnt: 6, ((T([64, 128, 2048], f16), T([64, 128, 2048], f16), 0), {})

View File

@ -0,0 +1,73 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([1024, 50265], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([1024, 50265], f16), T([1024, 50265], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 12, ((T([128, 128, 128], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 12, ((T([128, 128, 128], f16), T([128, 128, 128], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([128, 128], f32),), {'dtype': f16})
cnt: 1, ((T([8, 1, 128, 128], f16, stride=(0, 16384, 128, 1)),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten._unsafe_view.default
cnt: 36, ((T([8, 128, 16, 64], f16), [8, 128, 1024]), {})
cnt: 1, ((T([1024, 50265], f16), [8, 128, 50265]), {})
cnt: 12, ((T([8, 16, 128, 64], f16), [128, 128, 64]), {})
cnt: 12, ((T([8, 128, 1024], f16), [1024, 1024]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([8, 128], i64, stride=(0, 1)), 2), {})
cnt: 73, ((T([8, 128, 1024], f16), T([8, 128, 1024], f16)), {})
cnt: 1, ((T([128], i64), 1), {})
cnt: 12, ((T([8, 16, 128, 128], f16), T([8, 1, 128, 128], f16)), {})
cnt: 1, ((T([50265, 1024], f16), T([50265, 1024], f16)), {})
Operator: aten.addmm.default
cnt: 48, ((T([1024], f16), T([1024, 1024], f16), T([1024, 1024], f16, stride=(1, 1024))), {})
cnt: 12, ((T([4096], f16), T([1024, 1024], f16), T([1024, 4096], f16, stride=(1, 1024))), {})
cnt: 12, ((T([1024], f16), T([1024, 4096], f16), T([4096, 1024], f16, stride=(1, 4096))), {})
Operator: aten.bmm.default
cnt: 24, ((T([128, 128, 64], f16), T([128, 64, 128], f16, stride=(8192, 1, 64))), {})
cnt: 24, ((T([128, 128, 128], f16), T([128, 128, 64], f16)), {})
cnt: 12, ((T([128, 128, 128], f16, stride=(16384, 1, 128)), T([128, 128, 64], f16)), {})
cnt: 12, ((T([128, 64, 128], f16, stride=(8192, 1, 64)), T([128, 128, 128], f16)), {})
Operator: aten.clone.default
cnt: 2, ((T([8, 128], i64),), {})
Operator: aten.copy_.default
cnt: 2, ((T([8, 128], i64), T([8, 128], i64)), {})
Operator: aten.embedding.default
cnt: 1, ((T([50265, 1024], f16), T([8, 128], i64), 1), {})
cnt: 1, ((T([514, 1024], f16), T([8, 128], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([8, 128, 1024], f16), T([8, 128], i64), 514, -1, False), {})
cnt: 1, ((T([8, 128, 1024], f16), T([8, 128], i64), 50265, 1, False), {})
Operator: aten.gelu.default
cnt: 12, ((T([8, 128, 4096], f16),), {})
Operator: aten.gelu_backward.default
cnt: 12, ((T([8, 128, 4096], f16), T([8, 128, 4096], f16)), {})
Operator: aten.lt.Tensor
cnt: 1, ((T([128], i64), T([128, 1], i64)), {})
Operator: aten.masked_fill_.Scalar
cnt: 1, ((T([128, 128], f32), T([128, 128], b8), 0), {})
Operator: aten.mm.default
cnt: 1, ((T([1024, 1024], f16), T([1024, 50265], f16, stride=(1, 1024))), {})
cnt: 1, ((T([50265, 1024], f16, stride=(1, 50265)), T([1024, 1024], f16)), {})
cnt: 1, ((T([1024, 50265], f16), T([50265, 1024], f16)), {})
cnt: 12, ((T([1024, 1024], f16), T([1024, 4096], f16)), {})
cnt: 12, ((T([1024, 1024], f16, stride=(1, 1024)), T([1024, 4096], f16)), {})
cnt: 12, ((T([1024, 4096], f16), T([4096, 1024], f16)), {})
cnt: 12, ((T([4096, 1024], f16, stride=(1, 4096)), T([1024, 1024], f16)), {})
cnt: 48, ((T([1024, 1024], f16), T([1024, 1024], f16)), {})
cnt: 48, ((T([1024, 1024], f16, stride=(1, 1024)), T([1024, 1024], f16)), {})
Operator: aten.mul.Tensor
cnt: 2, ((T([8, 128, 1024], f16), 1.0), {})
cnt: 24, ((T([8, 128, 1024], f16), 0.125), {})
Operator: aten.native_layer_norm.default
cnt: 25, ((T([8, 128, 1024], f16), [1024], T([1024], f16), T([1024], f16), 1e-05), {})
Operator: aten.native_layer_norm_backward.default
cnt: 25, ((T([8, 128, 1024], f16), T([8, 128, 1024], f16), [1024], T([8, 128, 1], f32), T([8, 128, 1], f32), T([1024], f16), T([1024], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([1024, 50265], f16), T([1024], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([1024, 50265], f16), T([1024], i64), None, 1, -100), {})
Operator: aten.sum.SymInt
cnt: 60, ((T([1024, 1024], f16), [0], True), {})
cnt: 12, ((T([1024, 4096], f16), [0], True), {})

View File

@ -0,0 +1,88 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([256, 256008], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([256, 256008], f16), T([256, 256008], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 24, ((T([32, 128, 128], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 24, ((T([32, 128, 128], f16), T([32, 128, 128], f16), -1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([128, 128], f32),), {'dtype': f16})
cnt: 1, ((T([2, 1, 128, 128], f16, stride=(0, 16384, 128, 1)),), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
cnt: 1, ((T([2, 128], b8),), {'dtype': i32})
cnt: 1, ((T([2, 128], i64),), {'dtype': i32, 'layout': torch.strided, 'device': 'cuda'})
cnt: 1, ((T([2, 128], i32),), {'dtype': i64})
Operator: aten._unsafe_view.default
cnt: 72, ((T([2, 128, 16, 64], f16), [2, 128, 1024]), {})
cnt: 1, ((T([256, 256008], f16), [2, 128, 256008]), {})
cnt: 24, ((T([2, 16, 128, 64], f16), [32, 128, 64]), {})
cnt: 24, ((T([2, 128, 1024], f16), [256, 1024]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([128], i64), 1), {})
cnt: 1, ((T([2, 128], i32), 0), {})
cnt: 1, ((T([2, 128], i64), 1), {})
cnt: 145, ((T([2, 128, 1024], f16), T([2, 128, 1024], f16)), {})
cnt: 24, ((T([2, 16, 128, 128], f16), T([2, 1, 128, 128], f16)), {})
cnt: 1, ((T([256008, 1024], f16), T([256008, 1024], f16)), {})
Operator: aten.addmm.default
cnt: 96, ((T([1024], f16), T([256, 1024], f16), T([1024, 1024], f16, stride=(1, 1024))), {})
cnt: 24, ((T([4096], f16), T([256, 1024], f16), T([1024, 4096], f16, stride=(1, 1024))), {})
cnt: 24, ((T([1024], f16), T([256, 4096], f16), T([4096, 1024], f16, stride=(1, 4096))), {})
Operator: aten.bmm.default
cnt: 48, ((T([32, 128, 64], f16), T([32, 64, 128], f16, stride=(8192, 1, 64))), {})
cnt: 48, ((T([32, 128, 128], f16), T([32, 128, 64], f16)), {})
cnt: 24, ((T([32, 128, 128], f16, stride=(16384, 1, 128)), T([32, 128, 64], f16)), {})
cnt: 24, ((T([32, 64, 128], f16, stride=(8192, 1, 64)), T([32, 128, 128], f16)), {})
Operator: aten.clone.default
cnt: 2, ((T([2, 128], i64),), {})
cnt: 1, ((T([2, 127], i64, stride=(128, 1)),), {})
Operator: aten.copy_.default
cnt: 2, ((T([2, 128], i64), T([2, 128], i64)), {})
cnt: 1, ((T([2, 127], i64, stride=(128, 1)), T([2, 127], i64)), {})
Operator: aten.cumsum.default
cnt: 1, ((T([2, 128], i32), 1), {})
Operator: aten.embedding.default
cnt: 1, ((T([256008, 1024], f16), T([2, 128], i64), 1), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([2, 128, 1024], f16), T([2, 128], i64), 256008, 1, False), {})
Operator: aten.fill_.Tensor
cnt: 1, ((T([2], i64, stride=(128,)), T([], i64)), {})
Operator: aten.gelu.default
cnt: 24, ((T([2, 128, 4096], f16),), {})
Operator: aten.gelu_backward.default
cnt: 24, ((T([2, 128, 4096], f16), T([2, 128, 4096], f16)), {})
Operator: aten.index_select.default
cnt: 1, ((T([2050, 1024], f16), 0, T([256], i64)), {})
Operator: aten.lt.Tensor
cnt: 1, ((T([128], i64), T([128, 1], i64)), {})
Operator: aten.masked_fill_.Scalar
cnt: 1, ((T([128, 128], f32), T([128, 128], b8), 0), {})
Operator: aten.mm.default
cnt: 1, ((T([256, 1024], f16), T([1024, 256008], f16, stride=(1, 1024))), {})
cnt: 1, ((T([256008, 256], f16, stride=(1, 256008)), T([256, 1024], f16)), {})
cnt: 1, ((T([256, 256008], f16), T([256008, 1024], f16)), {})
cnt: 24, ((T([256, 1024], f16), T([1024, 4096], f16)), {})
cnt: 24, ((T([1024, 256], f16, stride=(1, 1024)), T([256, 4096], f16)), {})
cnt: 24, ((T([256, 4096], f16), T([4096, 1024], f16)), {})
cnt: 24, ((T([4096, 256], f16, stride=(1, 4096)), T([256, 1024], f16)), {})
cnt: 96, ((T([256, 1024], f16), T([1024, 1024], f16)), {})
cnt: 96, ((T([1024, 256], f16, stride=(1, 1024)), T([256, 1024], f16)), {})
Operator: aten.mul.Tensor
cnt: 2, ((T([2, 128, 1024], f16), 32.0), {})
cnt: 1, ((T([2, 128], i32), T([2, 128], i32)), {})
cnt: 48, ((T([2, 128, 1024], f16), 0.125), {})
Operator: aten.native_layer_norm.default
cnt: 49, ((T([2, 128, 1024], f16), [1024], T([1024], f16), T([1024], f16), 1e-05), {})
Operator: aten.native_layer_norm_backward.default
cnt: 49, ((T([2, 128, 1024], f16), T([2, 128, 1024], f16), [1024], T([2, 128, 1], f32), T([2, 128, 1], f32), T([1024], f16), T([1024], f16), [True, True, True]), {})
Operator: aten.ne.Scalar
cnt: 1, ((T([2, 128], i64), 1), {})
Operator: aten.new_zeros.default
cnt: 1, ((T([2, 128], i64), [2, 128]), {'dtype': i64, 'layout': torch.strided, 'device': 'cuda', 'pin_memory': False})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([256, 256008], f16), T([256], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([256, 256008], f16), T([256], i64), None, 1, -100), {})
Operator: aten.sum.SymInt
cnt: 120, ((T([256, 1024], f16), [0], True), {})
cnt: 24, ((T([256, 4096], f16), [0], True), {})

View File

@ -0,0 +1,105 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([2048, 32000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([2048, 32000], f16), T([2048, 32000], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 24, ((T([4, 16, 512, 512], f16), 3, False), {})
Operator: aten._softmax_backward_data.default
cnt: 24, ((T([4, 16, 512, 512], f16), T([4, 16, 512, 512], f16), 3, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([1024, 4, 1024], f32, stride=(1024, 0, 1)),), {'dtype': f32, 'layout': torch.strided, 'device': 'cuda'})
cnt: 24, ((T([1024, 4, 1024], f32),), {'dtype': f16, 'device': 'cuda'})
Operator: aten._unsafe_view.default
cnt: 24, ((T([512, 4, 64, 16, 1], f16), [1, 2048, 1024]), {})
cnt: 24, ((T([64, 16, 1024, 1, 1], f16), [1, 1024, 1024]), {})
cnt: 24, ((T([4, 16, 512, 1, 64], f16), [64, 512, 64]), {})
cnt: 24, ((T([1024, 4, 1, 16, 64], f16), [1, 4096, 1024]), {})
cnt: 72, ((T([512, 4, 1, 16, 64], f16), [1, 2048, 1024]), {})
Operator: aten.add.Tensor
cnt: 48, ((T([512, 4, 16, 64], f16), T([16, 64], f16)), {})
cnt: 24, ((T([4, 16, 512, 512], f16), T([4, 16, 512, 512], f16)), {})
cnt: 24, ((T([4, 16, 512, 512], f16), 0), {})
cnt: 144, ((T([512, 4, 1024], f16), T([512, 4, 1024], f16)), {})
cnt: 24, ((T([512, 4, 16, 64], f16, stride=(64, 524288, 32768, 1)), T([512, 4, 16, 64], f16, stride=(64, 524288, 32768, 1))), {})
cnt: 1, ((T([32000, 1024], f16), T([32000, 1024], f16)), {})
Operator: aten.addmm.default
cnt: 24, ((T([4096], f16), T([2048, 1024], f16), T([1024, 4096], f16, stride=(1, 1024))), {})
cnt: 24, ((T([1024], f16), T([2048, 4096], f16), T([4096, 1024], f16, stride=(1, 4096))), {})
cnt: 1, ((T([32000], f16), T([2048, 1024], f16), T([1024, 32000], f16, stride=(1, 1024))), {})
Operator: aten.bmm.default
cnt: 96, ((T([1, 2048, 1024], f16), T([1, 1024, 1024], f16)), {})
cnt: 24, ((T([1, 4096, 1024], f16), T([1, 1024, 1024], f16)), {})
cnt: 24, ((T([64, 512, 64], f16, stride=(64, 4096, 1)), T([64, 64, 512], f16, stride=(64, 1, 4096))), {})
cnt: 24, ((T([64, 512, 64], f16, stride=(64, 4096, 1)), T([64, 64, 1024], f16, stride=(64, 1, 4096))), {})
cnt: 48, ((T([64, 512, 512], f16), T([64, 512, 64], f16, stride=(64, 4096, 1))), {})
cnt: 96, ((T([1, 1024, 2048], f16, stride=(2097152, 1, 1024)), T([1, 2048, 1024], f16)), {})
cnt: 96, ((T([1, 2048, 1024], f16), T([1, 1024, 1024], f16, stride=(1048576, 1, 1024))), {})
cnt: 24, ((T([64, 512, 512], f16, stride=(262144, 1, 512)), T([64, 512, 64], f16)), {})
cnt: 24, ((T([64, 512, 64], f16), T([64, 64, 512], f16, stride=(64, 1, 4096))), {})
cnt: 24, ((T([64, 64, 512], f16, stride=(64, 1, 4096)), T([64, 512, 1024], f16)), {})
cnt: 24, ((T([64, 512, 1024], f16), T([64, 1024, 64], f16, stride=(64, 4096, 1))), {})
cnt: 24, ((T([64, 64, 512], f16, stride=(64, 1, 4096)), T([64, 512, 512], f16)), {})
cnt: 24, ((T([1, 1024, 4096], f16, stride=(4194304, 1, 1024)), T([1, 4096, 1024], f16)), {})
Operator: aten.cat.default
cnt: 1, (([T([1024, 512], f32), T([1024, 512], f32)], -1), {})
Operator: aten.clone.default
cnt: 2, ((T([4, 512], i64),), {})
Operator: aten.copy_.default
cnt: 2, ((T([4, 512], i64), T([4, 512], i64)), {})
cnt: 24, ((T([1024, 16, 64], f16), T([1024, 16, 64], f16, stride=(1, 1024, 16384))), {})
Operator: aten.cos.default
cnt: 1, ((T([1024, 512], f32),), {})
Operator: aten.div.Tensor
cnt: 1, ((T([512], f32), 1024), {})
Operator: aten.embedding.default
cnt: 1, ((T([32000, 1024], f16), T([512, 4], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([512, 4, 1024], f16), T([512, 4], i64), 32000, -1, False), {})
Operator: aten.gelu.default
cnt: 24, ((T([512, 4, 4096], f16),), {})
Operator: aten.gelu_backward.default
cnt: 24, ((T([512, 4, 4096], f16), T([512, 4, 4096], f16)), {})
Operator: aten.index_add.default
cnt: 24, ((T([4, 16, 512, 1023], f16), 3, T([512], i64), T([4, 16, 512, 512], f16)), {})
Operator: aten.index_select.default
cnt: 24, ((T([4, 16, 512, 1023], f16, stride=(8388608, 524288, 1023, 1)), 3, T([512], i64)), {})
Operator: aten.mm.default
cnt: 1, ((T([2048, 32000], f16), T([32000, 1024], f16)), {})
cnt: 1, ((T([32000, 2048], f16, stride=(1, 32000)), T([2048, 1024], f16)), {})
cnt: 24, ((T([2048, 1024], f16), T([1024, 4096], f16)), {})
cnt: 24, ((T([1024, 2048], f16, stride=(1, 1024)), T([2048, 4096], f16)), {})
cnt: 24, ((T([2048, 4096], f16), T([4096, 1024], f16)), {})
cnt: 24, ((T([4096, 2048], f16, stride=(1, 4096)), T([2048, 1024], f16)), {})
Operator: aten.mul.Tensor
cnt: 1, ((T([512], f32), 1), {})
cnt: 1, ((T([1024, 1], f32), T([1, 512], f32)), {})
cnt: 48, ((T([4, 16, 512, 512], f16), 0.125), {})
Operator: aten.native_layer_norm.default
cnt: 48, ((T([512, 4, 1024], f16), [1024], T([1024], f16), T([1024], f16), 1e-12), {})
Operator: aten.native_layer_norm_backward.default
cnt: 1, ((T([512, 4, 1024], f16, stride=(1024, 524288, 1)), T([512, 4, 1024], f16), [1024], T([512, 4, 1], f32), T([512, 4, 1], f32), T([1024], f16), T([1024], f16), [True, True, True]), {})
cnt: 47, ((T([512, 4, 1024], f16), T([512, 4, 1024], f16), [1024], T([512, 4, 1], f32), T([512, 4, 1], f32), T([1024], f16), T([1024], f16), [True, True, True]), {})
Operator: aten.new_empty_strided.default
cnt: 24, ((T([1024, 16, 64], f16, stride=(1, 1024, 16384)), [1024, 16, 64], [1024, 64, 1]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten.new_zeros.default
cnt: 24, ((T([4, 16, 512, 512], f16), [4, 16, 512, 1023]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([2048, 32000], f16), T([2048], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([2048, 32000], f16), T([2048], i64), None, 1, -100), {})
Operator: aten.pow.Scalar
cnt: 1, ((10000, T([512], f32)), {})
Operator: aten.reciprocal.default
cnt: 1, ((T([512], f32),), {})
Operator: aten.sin.default
cnt: 1, ((T([1024, 512], f32),), {})
Operator: aten.slice_backward.default
cnt: 24, ((T([4, 16, 1023, 512], f16), [4, 16, 1023, 512], 3, 0, 9223372036854775807, 1), {})
cnt: 24, ((T([4, 16, 1023, 512], f16), [4, 16, 1024, 512], 2, 1, 9223372036854775807, 1), {})
cnt: 24, ((T([4, 16, 1024, 512], f16), [4, 16, 1024, 512], 1, 0, 9223372036854775807, 1), {})
cnt: 24, ((T([4, 16, 1024, 512], f16), [4, 16, 1024, 512], 0, 0, 9223372036854775807, 1), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([2048, 32000], f16), [0], True), {})
cnt: 24, ((T([2048, 1024], f16), [0], True), {})
cnt: 24, ((T([2048, 4096], f16), [0], True), {})
cnt: 48, ((T([512, 4, 16, 64], f16, stride=(64, 524288, 32768, 1)), [0, 1], True), {})

View File

@ -0,0 +1,119 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([512, 30522], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([512, 30522], f16), T([512, 30522], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 12, ((T([3072, 9, 1], f16), 1, False), {})
cnt: 12, ((T([1, 6, 512, 512], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 12, ((T([1, 6, 512, 512], f16), T([1, 6, 512, 512], f16), -1, f16), {})
cnt: 12, ((T([3072, 9, 1], f16), T([3072, 9, 1], f16), 1, f16), {})
Operator: aten._to_copy.default
cnt: 1, ((T([1, 1, 1, 512], f32),), {'dtype': f16})
Operator: aten._unsafe_view.default
cnt: 12, ((T([1, 512, 54], f16), [1, 512, 54]), {})
cnt: 12, ((T([1, 512, 384, 9], f16), [3072, 64, 9]), {})
cnt: 12, ((T([3072, 64, 1], f16), [3072, 64, 1]), {})
cnt: 12, ((T([6, 512, 512], f16), [1, 6, 512, 512]), {})
cnt: 12, ((T([6, 512, 64], f16), [1, 6, 512, 64]), {})
cnt: 12, ((T([512, 384], f16), [3072, 64, 1]), {})
cnt: 24, ((T([1, 512, 6, 64], f16), [1, 512, 384]), {})
Operator: aten.add.Tensor
cnt: 86, ((T([1, 512, 768], f16), T([1, 512, 768], f16)), {})
cnt: 12, ((T([1, 512, 54], f16), T([54], f16)), {})
cnt: 12, ((T([1, 6, 512, 512], f16), T([1, 1, 1, 512], f16)), {})
cnt: 12, ((T([1, 512, 384], f16), T([1, 512, 384], f16)), {})
cnt: 12, ((T([1, 512, 768], f16), T([1, 512, 768], f16, stride=(393216, 1, 512))), {})
cnt: 1, ((T([30522, 768], f16), T([30522, 768], f16)), {})
Operator: aten.add_.Tensor
cnt: 12, ((T([1, 384, 512], f16), T([384, 1], f16)), {})
Operator: aten.addmm.default
cnt: 48, ((T([384], f16), T([512, 768], f16), T([768, 384], f16, stride=(1, 768))), {})
cnt: 13, ((T([768], f16), T([512, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 12, ((T([3072], f16), T([512, 768], f16), T([768, 3072], f16, stride=(1, 768))), {})
cnt: 12, ((T([768], f16), T([512, 3072], f16), T([3072, 768], f16, stride=(1, 3072))), {})
cnt: 1, ((T([30522], f16), T([512, 768], f16), T([768, 30522], f16, stride=(1, 768))), {})
Operator: aten.bmm.default
cnt: 12, ((T([1, 512, 384], f16, stride=(512, 1, 512)), T([1, 384, 54], f16, stride=(384, 1, 384))), {})
cnt: 12, ((T([3072, 64, 9], f16), T([3072, 9, 1], f16)), {})
cnt: 12, ((T([6, 512, 64], f16, stride=(64, 384, 1)), T([6, 64, 512], f16, stride=(64, 1, 384))), {})
cnt: 24, ((T([6, 512, 512], f16), T([6, 512, 64], f16, stride=(64, 384, 1))), {})
cnt: 12, ((T([6, 512, 512], f16, stride=(262144, 1, 512)), T([6, 512, 64], f16, stride=(64, 768, 1))), {})
cnt: 12, ((T([6, 512, 64], f16, stride=(64, 768, 1)), T([6, 64, 512], f16, stride=(64, 1, 384))), {})
cnt: 12, ((T([6, 64, 512], f16, stride=(64, 1, 384)), T([6, 512, 512], f16)), {})
cnt: 12, ((T([3072, 9, 64], f16, stride=(576, 1, 9)), T([3072, 64, 1], f16)), {})
cnt: 12, ((T([3072, 64, 1], f16), T([3072, 1, 9], f16)), {})
cnt: 12, ((T([1, 384, 512], f16), T([1, 512, 54], f16)), {})
cnt: 12, ((T([1, 512, 54], f16), T([1, 54, 384], f16)), {})
Operator: aten.cat.default
cnt: 12, (([T([1, 512, 6, 64], f16), T([1, 512, 6, 64], f16)], 2), {})
Operator: aten.clone.default
cnt: 2, ((T([1, 512], i64),), {})
Operator: aten.convolution.default
cnt: 12, ((T([1, 768, 512], f16, stride=(393216, 1, 768)), T([768, 1, 9], f16), None, [1], [4], [1], False, [0], 768), {})
cnt: 12, ((T([1, 768, 512], f16), T([384, 768, 1], f16), None, [1], [0], [1], False, [0], 1), {})
Operator: aten.convolution_backward.default
cnt: 12, ((T([1, 384, 512], f16, stride=(196608, 1, 384)), T([1, 768, 512], f16), T([384, 768, 1], f16), [0], [1], [0], [1], False, [0], 1, [True, True, False]), {})
cnt: 12, ((T([1, 768, 512], f16), T([1, 768, 512], f16, stride=(393216, 1, 768)), T([768, 1, 9], f16), [0], [1], [4], [1], False, [0], 768, [True, True, False]), {})
Operator: aten.copy_.default
cnt: 2, ((T([1, 512], i64), T([1, 512], i64)), {})
cnt: 12, ((T([54, 384], f16), T([54, 384], f16, stride=(1, 54))), {})
Operator: aten.div.Tensor
cnt: 24, ((T([1, 6, 512, 512], f16), 8.0), {})
Operator: aten.embedding.default
cnt: 1, ((T([30522, 768], f16), T([1, 512], i64), 0), {})
cnt: 1, ((T([512, 768], f16), T([1, 512], i64)), {})
cnt: 1, ((T([2, 768], f16), T([1, 512], i64)), {})
Operator: aten.embedding_dense_backward.default
cnt: 1, ((T([1, 512, 768], f16), T([1, 512], i64), 2, -1, False), {})
cnt: 1, ((T([1, 512, 768], f16), T([1, 512], i64), 512, -1, False), {})
cnt: 1, ((T([1, 512, 768], f16), T([1, 512], i64), 30522, 0, False), {})
Operator: aten.gelu.default
cnt: 12, ((T([1, 512, 3072], f16),), {})
cnt: 1, ((T([1, 512, 768], f16),), {})
Operator: aten.gelu_backward.default
cnt: 1, ((T([1, 512, 768], f16), T([1, 512, 768], f16)), {})
cnt: 12, ((T([1, 512, 3072], f16), T([1, 512, 3072], f16)), {})
Operator: aten.im2col.default
cnt: 12, ((T([1, 384, 512, 1], f16), [9, 1], [1, 1], [4, 0], [1, 1]), {})
Operator: aten.im2col_backward.default
cnt: 12, ((T([1, 3456, 512], f16, stride=(1769472, 1, 3456)), [512, 1], [9, 1], [1, 1], [4, 0], [1, 1]), {})
Operator: aten.mm.default
cnt: 1, ((T([512, 30522], f16), T([30522, 768], f16)), {})
cnt: 1, ((T([30522, 512], f16, stride=(1, 30522)), T([512, 768], f16)), {})
cnt: 13, ((T([512, 768], f16), T([768, 768], f16)), {})
cnt: 13, ((T([768, 512], f16, stride=(1, 768)), T([512, 768], f16)), {})
cnt: 12, ((T([512, 768], f16), T([768, 3072], f16)), {})
cnt: 12, ((T([768, 512], f16, stride=(1, 768)), T([512, 3072], f16)), {})
cnt: 12, ((T([512, 3072], f16), T([3072, 768], f16)), {})
cnt: 12, ((T([3072, 512], f16, stride=(1, 3072)), T([512, 768], f16)), {})
cnt: 24, ((T([512, 384], f16, stride=(1, 512)), T([384, 768], f16)), {})
cnt: 24, ((T([384, 512], f16), T([512, 768], f16)), {})
cnt: 24, ((T([512, 384], f16), T([384, 768], f16)), {})
cnt: 24, ((T([384, 512], f16, stride=(1, 384)), T([512, 768], f16)), {})
Operator: aten.mul.Tensor
cnt: 1, ((T([1, 1, 1, 512], f16), -65504.0), {})
cnt: 12, ((T([1, 512, 384], f16, stride=(196608, 1, 512)), T([1, 512, 384], f16)), {})
cnt: 12, ((T([1, 512, 384], f16), T([1, 512, 384], f16, stride=(196608, 1, 512))), {})
cnt: 12, ((T([1, 512, 384], f16), T([1, 512, 384], f16)), {})
Operator: aten.native_layer_norm.default
cnt: 26, ((T([1, 512, 768], f16), [768], T([768], f16), T([768], f16), 1e-12), {})
Operator: aten.native_layer_norm_backward.default
cnt: 26, ((T([1, 512, 768], f16), T([1, 512, 768], f16), [768], T([1, 512, 1], f32), T([1, 512, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
Operator: aten.new_empty_strided.default
cnt: 12, ((T([54, 384], f16, stride=(1, 54)), [54, 384], [384, 1]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([512, 30522], f16), T([512], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([512, 30522], f16), T([512], i64), None, 1, -100), {})
Operator: aten.rsub.Scalar
cnt: 1, ((T([1, 1, 1, 512], f16), 1.0), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([512, 30522], f16), [0], True), {})
cnt: 25, ((T([512, 768], f16), [0], True), {})
cnt: 12, ((T([512, 3072], f16), [0], True), {})
cnt: 24, ((T([512, 384], f16, stride=(1, 512)), [0], True), {})
cnt: 12, ((T([1, 512, 54], f16), [0, 1], True), {})
cnt: 12, ((T([1, 384, 54], f16), [0], True), {})
cnt: 12, ((T([1, 384, 512], f16, stride=(196608, 1, 384)), [0, 2], True), {})
cnt: 24, ((T([512, 384], f16), [0], True), {})

View File

@ -0,0 +1,239 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([128, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([128, 1000], f16), T([128, 1000], f16), 1, f16), {})
Operator: aten.add.Tensor
cnt: 4, ((T([128, 384, 8, 8], f16), T([128, 384, 8, 8], f16)), {})
cnt: 3, ((T([128, 2048, 8, 8], f16), T([128, 2048, 8, 8], f16)), {})
cnt: 3, ((T([128, 1280, 8, 8], f16), T([128, 1280, 8, 8], f16)), {})
cnt: 14, ((T([128, 768, 17, 17], f16), T([128, 768, 17, 17], f16)), {})
cnt: 5, ((T([128, 288, 35, 35], f16), T([128, 288, 35, 35], f16)), {})
cnt: 3, ((T([128, 256, 35, 35], f16), T([128, 256, 35, 35], f16)), {})
cnt: 3, ((T([128, 192, 35, 35], f16), T([128, 192, 35, 35], f16)), {})
Operator: aten.add_.Tensor
cnt: 94, ((T([], i64), 1), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([128, 2048], f16), T([2048, 1000], f16, stride=(1, 2048))), {})
Operator: aten.avg_pool2d.default
cnt: 1, ((T([128, 192, 35, 35], f16), [3, 3], [1, 1], [1, 1]), {})
cnt: 1, ((T([128, 256, 35, 35], f16), [3, 3], [1, 1], [1, 1]), {})
cnt: 1, ((T([128, 288, 35, 35], f16), [3, 3], [1, 1], [1, 1]), {})
cnt: 4, ((T([128, 768, 17, 17], f16), [3, 3], [1, 1], [1, 1]), {})
cnt: 1, ((T([128, 1280, 8, 8], f16), [3, 3], [1, 1], [1, 1]), {})
cnt: 1, ((T([128, 2048, 8, 8], f16), [3, 3], [1, 1], [1, 1]), {})
Operator: aten.avg_pool2d_backward.default
cnt: 1, ((T([128, 2048, 8, 8], f16), T([128, 2048, 8, 8], f16), [3, 3], [1, 1], [1, 1], False, True, None), {})
cnt: 1, ((T([128, 1280, 8, 8], f16), T([128, 1280, 8, 8], f16), [3, 3], [1, 1], [1, 1], False, True, None), {})
cnt: 4, ((T([128, 768, 17, 17], f16), T([128, 768, 17, 17], f16), [3, 3], [1, 1], [1, 1], False, True, None), {})
cnt: 1, ((T([128, 288, 35, 35], f16), T([128, 288, 35, 35], f16), [3, 3], [1, 1], [1, 1], False, True, None), {})
cnt: 1, ((T([128, 256, 35, 35], f16), T([128, 256, 35, 35], f16), [3, 3], [1, 1], [1, 1], False, True, None), {})
cnt: 1, ((T([128, 192, 35, 35], f16), T([128, 192, 35, 35], f16), [3, 3], [1, 1], [1, 1], False, True, None), {})
Operator: aten.cat.default
cnt: 1, (([T([128, 64, 35, 35], f16), T([128, 64, 35, 35], f16), T([128, 96, 35, 35], f16), T([128, 32, 35, 35], f16)], 1), {})
cnt: 2, (([T([128, 64, 35, 35], f16), T([128, 64, 35, 35], f16), T([128, 96, 35, 35], f16), T([128, 64, 35, 35], f16)], 1), {})
cnt: 1, (([T([128, 384, 17, 17], f16), T([128, 96, 17, 17], f16), T([128, 288, 17, 17], f16)], 1), {})
cnt: 4, (([T([128, 192, 17, 17], f16), T([128, 192, 17, 17], f16), T([128, 192, 17, 17], f16), T([128, 192, 17, 17], f16)], 1), {})
cnt: 1, (([T([128, 320, 8, 8], f16), T([128, 192, 8, 8], f16), T([128, 768, 8, 8], f16)], 1), {})
cnt: 4, (([T([128, 384, 8, 8], f16), T([128, 384, 8, 8], f16)], 1), {})
cnt: 2, (([T([128, 320, 8, 8], f16), T([128, 768, 8, 8], f16), T([128, 768, 8, 8], f16), T([128, 192, 8, 8], f16)], 1), {})
Operator: aten.clone.default
cnt: 1, ((T([128, 3, 299, 299], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([128, 3, 299, 299], f16), T([32, 3, 3, 3], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 32, 149, 149], f16), T([32, 32, 3, 3], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 32, 147, 147], f16), T([64, 32, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 64, 73, 73], f16), T([80, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 80, 73, 73], f16), T([192, 80, 3, 3], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 192, 35, 35], f16), T([64, 192, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 192, 35, 35], f16), T([48, 192, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 48, 35, 35], f16), T([64, 48, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 64, 35, 35], f16), T([96, 64, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 96, 35, 35], f16), T([96, 96, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 192, 35, 35], f16), T([32, 192, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 256, 35, 35], f16), T([64, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 35, 35], f16), T([48, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 288, 35, 35], f16), T([64, 288, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 288, 35, 35], f16), T([48, 288, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 288, 35, 35], f16), T([384, 288, 3, 3], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 96, 35, 35], f16), T([96, 96, 3, 3], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 12, ((T([128, 768, 17, 17], f16), T([192, 768, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 768, 17, 17], f16), T([128, 768, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 128, 17, 17], f16), T([128, 128, 1, 7], f16), None, [1, 1], [0, 3], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 128, 17, 17], f16), T([192, 128, 7, 1], f16), None, [1, 1], [3, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 128, 17, 17], f16), T([128, 128, 7, 1], f16), None, [1, 1], [3, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 128, 17, 17], f16), T([192, 128, 1, 7], f16), None, [1, 1], [0, 3], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 768, 17, 17], f16), T([160, 768, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 160, 17, 17], f16), T([160, 160, 1, 7], f16), None, [1, 1], [0, 3], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 160, 17, 17], f16), T([192, 160, 7, 1], f16), None, [1, 1], [3, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 160, 17, 17], f16), T([160, 160, 7, 1], f16), None, [1, 1], [3, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 160, 17, 17], f16), T([192, 160, 1, 7], f16), None, [1, 1], [0, 3], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 192, 17, 17], f16), T([192, 192, 1, 7], f16), None, [1, 1], [0, 3], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 192, 17, 17], f16), T([192, 192, 7, 1], f16), None, [1, 1], [3, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 192, 17, 17], f16), T([320, 192, 3, 3], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 192, 17, 17], f16), T([192, 192, 3, 3], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1280, 8, 8], f16), T([320, 1280, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1280, 8, 8], f16), T([384, 1280, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 384, 8, 8], f16), T([384, 384, 1, 3], f16), None, [1, 1], [0, 1], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 384, 8, 8], f16), T([384, 384, 3, 1], f16), None, [1, 1], [1, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1280, 8, 8], f16), T([448, 1280, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 448, 8, 8], f16), T([384, 448, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1280, 8, 8], f16), T([192, 1280, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 2048, 8, 8], f16), T([320, 2048, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 2048, 8, 8], f16), T([384, 2048, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 2048, 8, 8], f16), T([448, 2048, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 2048, 8, 8], f16), T([192, 2048, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([128, 192, 8, 8], f16), T([128, 2048, 8, 8], f16), T([192, 2048, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 384, 8, 8], f16), T([128, 384, 8, 8], f16), T([384, 384, 3, 1], f16), [0], [1, 1], [1, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 384, 8, 8], f16), T([128, 384, 8, 8], f16), T([384, 384, 1, 3], f16), [0], [1, 1], [0, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 384, 8, 8], f16), T([128, 448, 8, 8], f16), T([384, 448, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 448, 8, 8], f16), T([128, 2048, 8, 8], f16), T([448, 2048, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 384, 8, 8], f16), T([128, 2048, 8, 8], f16), T([384, 2048, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 320, 8, 8], f16), T([128, 2048, 8, 8], f16), T([320, 2048, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 192, 8, 8], f16), T([128, 1280, 8, 8], f16), T([192, 1280, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 448, 8, 8], f16), T([128, 1280, 8, 8], f16), T([448, 1280, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 384, 8, 8], f16), T([128, 1280, 8, 8], f16), T([384, 1280, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 320, 8, 8], f16), T([128, 1280, 8, 8], f16), T([320, 1280, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 192, 8, 8], f16), T([128, 192, 17, 17], f16), T([192, 192, 3, 3], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 192, 17, 17], f16), T([128, 192, 17, 17], f16), T([192, 192, 7, 1], f16), [0], [1, 1], [3, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 192, 17, 17], f16), T([128, 192, 17, 17], f16), T([192, 192, 1, 7], f16), [0], [1, 1], [0, 3], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 12, ((T([128, 192, 17, 17], f16), T([128, 768, 17, 17], f16), T([192, 768, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 320, 8, 8], f16), T([128, 192, 17, 17], f16), T([320, 192, 3, 3], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 192, 17, 17], f16), T([128, 160, 17, 17], f16), T([192, 160, 1, 7], f16), [0], [1, 1], [0, 3], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 160, 17, 17], f16), T([128, 160, 17, 17], f16), T([160, 160, 7, 1], f16), [0], [1, 1], [3, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 160, 17, 17], f16), T([128, 160, 17, 17], f16), T([160, 160, 1, 7], f16), [0], [1, 1], [0, 3], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 160, 17, 17], f16), T([128, 768, 17, 17], f16), T([160, 768, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 192, 17, 17], f16), T([128, 160, 17, 17], f16), T([192, 160, 7, 1], f16), [0], [1, 1], [3, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 192, 17, 17], f16), T([128, 128, 17, 17], f16), T([192, 128, 1, 7], f16), [0], [1, 1], [0, 3], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 128, 17, 17], f16), T([128, 128, 17, 17], f16), T([128, 128, 7, 1], f16), [0], [1, 1], [3, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 128, 17, 17], f16), T([128, 128, 17, 17], f16), T([128, 128, 1, 7], f16), [0], [1, 1], [0, 3], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 128, 17, 17], f16), T([128, 768, 17, 17], f16), T([128, 768, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 192, 17, 17], f16), T([128, 128, 17, 17], f16), T([192, 128, 7, 1], f16), [0], [1, 1], [3, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 96, 17, 17], f16), T([128, 96, 35, 35], f16), T([96, 96, 3, 3], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 96, 35, 35], f16), T([128, 64, 35, 35], f16), T([96, 64, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 64, 35, 35], f16), T([128, 288, 35, 35], f16), T([64, 288, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 384, 17, 17], f16), T([128, 288, 35, 35], f16), T([384, 288, 3, 3], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 96, 35, 35], f16), T([128, 96, 35, 35], f16), T([96, 96, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 64, 35, 35], f16), T([128, 48, 35, 35], f16), T([64, 48, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 48, 35, 35], f16), T([128, 288, 35, 35], f16), T([48, 288, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 64, 35, 35], f16), T([128, 256, 35, 35], f16), T([64, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 48, 35, 35], f16), T([128, 256, 35, 35], f16), T([48, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 32, 35, 35], f16), T([128, 192, 35, 35], f16), T([32, 192, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 64, 35, 35], f16), T([128, 192, 35, 35], f16), T([64, 192, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 48, 35, 35], f16), T([128, 192, 35, 35], f16), T([48, 192, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 192, 71, 71], f16), T([128, 80, 73, 73], f16), T([192, 80, 3, 3], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 80, 73, 73], f16), T([128, 64, 73, 73], f16), T([80, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 64, 147, 147], f16), T([128, 32, 147, 147], f16), T([64, 32, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 32, 147, 147], f16), T([128, 32, 149, 149], f16), T([32, 32, 3, 3], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 32, 149, 149], f16), T([128, 3, 299, 299], f16), T([32, 3, 3, 3], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [False, True, False]), {})
Operator: aten.copy_.default
cnt: 1, ((T([128, 3, 299, 299], f16), T([128, 3, 299, 299], f16)), {})
Operator: aten.div.Scalar
cnt: 1, ((T([128, 2048, 8, 8], f16, stride=(2048, 1, 0, 0)), 64), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([128], i64),), {})
Operator: aten.max_pool2d_with_indices.default
cnt: 1, ((T([128, 64, 147, 147], f16), [3, 3], [2, 2]), {})
cnt: 1, ((T([128, 192, 71, 71], f16), [3, 3], [2, 2]), {})
cnt: 1, ((T([128, 288, 35, 35], f16), [3, 3], [2, 2]), {})
cnt: 1, ((T([128, 768, 17, 17], f16), [3, 3], [2, 2]), {})
Operator: aten.max_pool2d_with_indices_backward.default
cnt: 1, ((T([128, 768, 8, 8], f16, stride=(81920, 64, 8, 1)), T([128, 768, 17, 17], f16), [3, 3], [2, 2], [0, 0], [1, 1], False, T([128, 768, 8, 8], i64)), {})
cnt: 1, ((T([128, 288, 17, 17], f16, stride=(221952, 289, 17, 1)), T([128, 288, 35, 35], f16), [3, 3], [2, 2], [0, 0], [1, 1], False, T([128, 288, 17, 17], i64)), {})
cnt: 1, ((T([128, 192, 35, 35], f16), T([128, 192, 71, 71], f16), [3, 3], [2, 2], [0, 0], [1, 1], False, T([128, 192, 35, 35], i64)), {})
cnt: 1, ((T([128, 64, 73, 73], f16), T([128, 64, 147, 147], f16), [3, 3], [2, 2], [0, 0], [1, 1], False, T([128, 64, 73, 73], i64)), {})
Operator: aten.mean.dim
cnt: 1, ((T([128, 2048, 8, 8], f16), [-1, -2], True), {})
Operator: aten.mm.default
cnt: 1, ((T([128, 1000], f16), T([1000, 2048], f16)), {})
cnt: 1, ((T([1000, 128], f16, stride=(1, 1000)), T([128, 2048], f16)), {})
Operator: aten.native_batch_norm.default
cnt: 1, ((T([128, 32, 149, 149], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([128, 32, 147, 147], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([128, 64, 147, 147], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([128, 80, 73, 73], f16), T([80], f16), T([80], f16), T([80], f16), T([80], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([128, 192, 71, 71], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f16), True, 0.1, 0.001), {})
cnt: 12, ((T([128, 64, 35, 35], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 0.001), {})
cnt: 3, ((T([128, 48, 35, 35], f16), T([48], f16), T([48], f16), T([48], f16), T([48], f16), True, 0.1, 0.001), {})
cnt: 7, ((T([128, 96, 35, 35], f16), T([96], f16), T([96], f16), T([96], f16), T([96], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([128, 32, 35, 35], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([128, 384, 17, 17], f16), T([384], f16), T([384], f16), T([384], f16), T([384], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([128, 96, 17, 17], f16), T([96], f16), T([96], f16), T([96], f16), T([96], f16), True, 0.1, 0.001), {})
cnt: 26, ((T([128, 192, 17, 17], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f16), True, 0.1, 0.001), {})
cnt: 6, ((T([128, 128, 17, 17], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 0.001), {})
cnt: 12, ((T([128, 160, 17, 17], f16), T([160], f16), T([160], f16), T([160], f16), T([160], f16), True, 0.1, 0.001), {})
cnt: 3, ((T([128, 320, 8, 8], f16), T([320], f16), T([320], f16), T([320], f16), T([320], f16), True, 0.1, 0.001), {})
cnt: 3, ((T([128, 192, 8, 8], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f16), True, 0.1, 0.001), {})
cnt: 12, ((T([128, 384, 8, 8], f16), T([384], f16), T([384], f16), T([384], f16), T([384], f16), True, 0.1, 0.001), {})
cnt: 2, ((T([128, 448, 8, 8], f16), T([448], f16), T([448], f16), T([448], f16), T([448], f16), True, 0.1, 0.001), {})
Operator: aten.native_batch_norm_backward.default
cnt: 3, ((T([128, 192, 8, 8], f16), T([128, 192, 8, 8], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f32), T([192], f32), True, 0.001, [True, True, True]), {})
cnt: 12, ((T([128, 384, 8, 8], f16), T([128, 384, 8, 8], f16), T([384], f16), T([384], f16), T([384], f16), T([384], f32), T([384], f32), True, 0.001, [True, True, True]), {})
cnt: 2, ((T([128, 448, 8, 8], f16), T([128, 448, 8, 8], f16), T([448], f16), T([448], f16), T([448], f16), T([448], f32), T([448], f32), True, 0.001, [True, True, True]), {})
cnt: 3, ((T([128, 320, 8, 8], f16), T([128, 320, 8, 8], f16), T([320], f16), T([320], f16), T([320], f16), T([320], f32), T([320], f32), True, 0.001, [True, True, True]), {})
cnt: 26, ((T([128, 192, 17, 17], f16), T([128, 192, 17, 17], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f32), T([192], f32), True, 0.001, [True, True, True]), {})
cnt: 12, ((T([128, 160, 17, 17], f16), T([128, 160, 17, 17], f16), T([160], f16), T([160], f16), T([160], f16), T([160], f32), T([160], f32), True, 0.001, [True, True, True]), {})
cnt: 6, ((T([128, 128, 17, 17], f16), T([128, 128, 17, 17], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([128, 96, 17, 17], f16), T([128, 96, 17, 17], f16), T([96], f16), T([96], f16), T([96], f16), T([96], f32), T([96], f32), True, 0.001, [True, True, True]), {})
cnt: 7, ((T([128, 96, 35, 35], f16), T([128, 96, 35, 35], f16), T([96], f16), T([96], f16), T([96], f16), T([96], f32), T([96], f32), True, 0.001, [True, True, True]), {})
cnt: 12, ((T([128, 64, 35, 35], f16), T([128, 64, 35, 35], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([128, 384, 17, 17], f16), T([128, 384, 17, 17], f16), T([384], f16), T([384], f16), T([384], f16), T([384], f32), T([384], f32), True, 0.001, [True, True, True]), {})
cnt: 3, ((T([128, 48, 35, 35], f16), T([128, 48, 35, 35], f16), T([48], f16), T([48], f16), T([48], f16), T([48], f32), T([48], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([128, 32, 35, 35], f16), T([128, 32, 35, 35], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([128, 192, 71, 71], f16), T([128, 192, 71, 71], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f32), T([192], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([128, 80, 73, 73], f16), T([128, 80, 73, 73], f16), T([80], f16), T([80], f16), T([80], f16), T([80], f32), T([80], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([128, 64, 147, 147], f16), T([128, 64, 147, 147], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([128, 32, 147, 147], f16), T([128, 32, 147, 147], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([128, 32, 149, 149], f16), T([128, 32, 149, 149], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 0.001, [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([128, 1000], f16), T([128], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([128, 1000], f16), T([128], i64), None, 1, -100), {})
Operator: aten.relu_.default
cnt: 1, ((T([128, 32, 149, 149], f16),), {})
cnt: 1, ((T([128, 32, 147, 147], f16),), {})
cnt: 1, ((T([128, 64, 147, 147], f16),), {})
cnt: 1, ((T([128, 80, 73, 73], f16),), {})
cnt: 1, ((T([128, 192, 71, 71], f16),), {})
cnt: 12, ((T([128, 64, 35, 35], f16),), {})
cnt: 3, ((T([128, 48, 35, 35], f16),), {})
cnt: 7, ((T([128, 96, 35, 35], f16),), {})
cnt: 1, ((T([128, 32, 35, 35], f16),), {})
cnt: 1, ((T([128, 384, 17, 17], f16),), {})
cnt: 1, ((T([128, 96, 17, 17], f16),), {})
cnt: 26, ((T([128, 192, 17, 17], f16),), {})
cnt: 6, ((T([128, 128, 17, 17], f16),), {})
cnt: 12, ((T([128, 160, 17, 17], f16),), {})
cnt: 3, ((T([128, 320, 8, 8], f16),), {})
cnt: 3, ((T([128, 192, 8, 8], f16),), {})
cnt: 12, ((T([128, 384, 8, 8], f16),), {})
cnt: 2, ((T([128, 448, 8, 8], f16),), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([128, 1000], f16), [0], True), {})
Operator: aten.threshold_backward.default
cnt: 2, ((T([128, 192, 8, 8], f16, stride=(131072, 64, 8, 1)), T([128, 192, 8, 8], f16), 0), {})
cnt: 8, ((T([128, 384, 8, 8], f16, stride=(131072, 64, 8, 1)), T([128, 384, 8, 8], f16), 0), {})
cnt: 4, ((T([128, 384, 8, 8], f16), T([128, 384, 8, 8], f16), 0), {})
cnt: 2, ((T([128, 448, 8, 8], f16), T([128, 448, 8, 8], f16), 0), {})
cnt: 2, ((T([128, 320, 8, 8], f16, stride=(131072, 64, 8, 1)), T([128, 320, 8, 8], f16), 0), {})
cnt: 1, ((T([128, 192, 8, 8], f16, stride=(81920, 64, 8, 1)), T([128, 192, 8, 8], f16), 0), {})
cnt: 10, ((T([128, 192, 17, 17], f16), T([128, 192, 17, 17], f16), 0), {})
cnt: 1, ((T([128, 320, 8, 8], f16, stride=(81920, 64, 8, 1)), T([128, 320, 8, 8], f16), 0), {})
cnt: 16, ((T([128, 192, 17, 17], f16, stride=(221952, 289, 17, 1)), T([128, 192, 17, 17], f16), 0), {})
cnt: 12, ((T([128, 160, 17, 17], f16), T([128, 160, 17, 17], f16), 0), {})
cnt: 6, ((T([128, 128, 17, 17], f16), T([128, 128, 17, 17], f16), 0), {})
cnt: 1, ((T([128, 96, 17, 17], f16, stride=(221952, 289, 17, 1)), T([128, 96, 17, 17], f16), 0), {})
cnt: 4, ((T([128, 96, 35, 35], f16), T([128, 96, 35, 35], f16), 0), {})
cnt: 4, ((T([128, 64, 35, 35], f16), T([128, 64, 35, 35], f16), 0), {})
cnt: 1, ((T([128, 384, 17, 17], f16, stride=(221952, 289, 17, 1)), T([128, 384, 17, 17], f16), 0), {})
cnt: 6, ((T([128, 64, 35, 35], f16, stride=(352800, 1225, 35, 1)), T([128, 64, 35, 35], f16), 0), {})
cnt: 2, ((T([128, 96, 35, 35], f16, stride=(352800, 1225, 35, 1)), T([128, 96, 35, 35], f16), 0), {})
cnt: 3, ((T([128, 48, 35, 35], f16), T([128, 48, 35, 35], f16), 0), {})
cnt: 1, ((T([128, 32, 35, 35], f16, stride=(313600, 1225, 35, 1)), T([128, 32, 35, 35], f16), 0), {})
cnt: 1, ((T([128, 96, 35, 35], f16, stride=(313600, 1225, 35, 1)), T([128, 96, 35, 35], f16), 0), {})
cnt: 2, ((T([128, 64, 35, 35], f16, stride=(313600, 1225, 35, 1)), T([128, 64, 35, 35], f16), 0), {})
cnt: 1, ((T([128, 192, 71, 71], f16), T([128, 192, 71, 71], f16), 0), {})
cnt: 1, ((T([128, 80, 73, 73], f16), T([128, 80, 73, 73], f16), 0), {})
cnt: 1, ((T([128, 64, 147, 147], f16), T([128, 64, 147, 147], f16), 0), {})
cnt: 1, ((T([128, 32, 147, 147], f16), T([128, 32, 147, 147], f16), 0), {})
cnt: 1, ((T([128, 32, 149, 149], f16), T([128, 32, 149, 149], f16), 0), {})

View File

@ -0,0 +1,100 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([64, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([64, 1000], f16), T([64, 1000], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 12, ((T([64, 12, 197, 197], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 12, ((T([64, 12, 197, 197], f16), T([64, 12, 197, 197], f16), -1, f16), {})
Operator: aten._unsafe_view.default
cnt: 36, ((T([64, 12, 197, 64], f16), [768, 197, 64]), {})
cnt: 12, ((T([64, 12, 64, 197], f16), [768, 64, 197]), {})
cnt: 12, ((T([768, 197, 197], f16), [64, 12, 197, 197]), {})
cnt: 12, ((T([768, 197, 64], f16), [64, 12, 197, 64]), {})
cnt: 12, ((T([64, 197, 12, 64], f16), [64, 197, 768]), {})
cnt: 12, ((T([64, 197, 3, 12, 64], f16), [64, 197, 2304]), {})
Operator: aten.add.Tensor
cnt: 12, ((T([64, 12, 197, 197], f16), T([1, 12, 197, 197], f16)), {})
cnt: 48, ((T([64, 197, 768], f16), T([64, 197, 768], f16)), {})
Operator: aten.addmm.default
cnt: 12, ((T([2304], f16), T([12608, 768], f16), T([768, 2304], f16, stride=(1, 768))), {})
cnt: 12, ((T([768], f16), T([12608, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 12, ((T([3072], f16), T([12608, 768], f16), T([768, 3072], f16, stride=(1, 768))), {})
cnt: 12, ((T([768], f16), T([12608, 3072], f16), T([3072, 768], f16, stride=(1, 3072))), {})
cnt: 1, ((T([1000], f16), T([64, 768], f16), T([768, 1000], f16, stride=(1, 768))), {})
Operator: aten.bmm.default
cnt: 12, ((T([768, 197, 64], f16), T([768, 64, 197], f16)), {})
cnt: 12, ((T([768, 197, 197], f16), T([768, 197, 64], f16)), {})
cnt: 12, ((T([768, 197, 197], f16, stride=(38809, 1, 197)), T([768, 197, 64], f16)), {})
cnt: 12, ((T([768, 197, 64], f16), T([768, 64, 197], f16, stride=(12608, 1, 64))), {})
cnt: 12, ((T([768, 64, 197], f16, stride=(12608, 1, 64)), T([768, 197, 197], f16)), {})
cnt: 12, ((T([768, 197, 197], f16), T([768, 197, 64], f16, stride=(12608, 1, 197))), {})
Operator: aten.cat.default
cnt: 1, (([T([64, 1, 768], f16, stride=(0, 768, 1)), T([64, 196, 768], f16, stride=(150528, 1, 196))], 1), {})
cnt: 12, (([T([768], f16), T([768], f16), T([768], f16)],), {})
Operator: aten.clone.default
cnt: 1, ((T([64, 3, 224, 224], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([64, 3, 224, 224], f16), T([768, 3, 16, 16], f16), T([768], f16), [16, 16], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([64, 768, 14, 14], f16, stride=(151296, 1, 10752, 768)), T([64, 3, 224, 224], f16), T([768, 3, 16, 16], f16), [768], [16, 16], [0, 0], [1, 1], False, [0, 0], 1, [False, True, True]), {})
Operator: aten.copy_.default
cnt: 1, ((T([64, 3, 224, 224], f16), T([64, 3, 224, 224], f16)), {})
Operator: aten.div.Scalar
cnt: 1, ((T([64, 196, 768], f16, stride=(768, 0, 1)), 196), {})
Operator: aten.gelu.default
cnt: 12, ((T([64, 197, 3072], f16),), {})
Operator: aten.gelu_backward.default
cnt: 12, ((T([64, 197, 3072], f16), T([64, 197, 3072], f16)), {})
Operator: aten.index.Tensor
cnt: 12, ((T([732, 12], f16), [T([38809], i64)]), {})
Operator: aten.index_put.default
cnt: 12, ((T([732, 12], f16), [T([38809], i64)], T([38809, 12], f16, stride=(1, 38809)), True), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([64], i64),), {})
Operator: aten.mean.dim
cnt: 1, ((T([64, 196, 768], f16, stride=(151296, 768, 1)), [1]), {})
Operator: aten.mm.default
cnt: 1, ((T([64, 1000], f16), T([1000, 768], f16)), {})
cnt: 1, ((T([1000, 64], f16, stride=(1, 1000)), T([64, 768], f16)), {})
cnt: 12, ((T([12608, 768], f16), T([768, 3072], f16)), {})
cnt: 12, ((T([768, 12608], f16, stride=(1, 768)), T([12608, 3072], f16)), {})
cnt: 12, ((T([12608, 3072], f16), T([3072, 768], f16)), {})
cnt: 12, ((T([3072, 12608], f16, stride=(1, 3072)), T([12608, 768], f16)), {})
cnt: 12, ((T([12608, 768], f16), T([768, 768], f16)), {})
cnt: 12, ((T([768, 12608], f16, stride=(1, 768)), T([12608, 768], f16)), {})
cnt: 12, ((T([12608, 2304], f16), T([2304, 768], f16)), {})
cnt: 12, ((T([2304, 12608], f16, stride=(1, 2304)), T([12608, 768], f16)), {})
Operator: aten.mul.Tensor
cnt: 12, ((T([64, 12, 197, 64], f16, stride=(453888, 64, 2304, 1)), 0.125), {})
cnt: 24, ((T([768], f16), T([64, 197, 768], f16)), {})
cnt: 24, ((T([64, 197, 768], f16), T([768], f16)), {})
cnt: 24, ((T([64, 197, 768], f16), T([64, 197, 768], f16)), {})
cnt: 12, ((T([64, 12, 197, 64], f16), 0.125), {})
Operator: aten.native_layer_norm.default
cnt: 24, ((T([64, 197, 768], f16), [768], T([768], f16), T([768], f16), 1e-06), {})
cnt: 1, ((T([64, 768], f16), [768], T([768], f16), T([768], f16), 1e-06), {})
Operator: aten.native_layer_norm_backward.default
cnt: 1, ((T([64, 768], f16), T([64, 768], f16), [768], T([64, 1], f32), T([64, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
cnt: 24, ((T([64, 197, 768], f16), T([64, 197, 768], f16), [768], T([64, 197, 1], f32), T([64, 197, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
Operator: aten.new_zeros.default
cnt: 12, ((T([38809, 12], f16, stride=(1, 38809)), [732, 12]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([64, 1000], f16), T([64], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([64, 1000], f16), T([64], i64), None, 1, -100), {})
Operator: aten.slice_backward.default
cnt: 1, ((T([64, 196, 768], f16), [64, 197, 768], 1, 1, 9223372036854775807, 1), {})
cnt: 1, ((T([64, 197, 768], f16), [64, 197, 768], 0, 0, 9223372036854775807, 1), {})
Operator: aten.stack.default
cnt: 12, (([T([64, 12, 197, 64], f16), T([64, 12, 197, 64], f16, stride=(151296, 12608, 1, 197)), T([64, 12, 197, 64], f16)],), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([64, 1000], f16), [0], True), {})
cnt: 24, ((T([64, 197, 768], f16), [0, 1], True), {})
cnt: 24, ((T([12608, 768], f16), [0], True), {})
cnt: 12, ((T([12608, 3072], f16), [0], True), {})
cnt: 12, ((T([64, 12, 197, 197], f16), [0], True), {})
cnt: 12, ((T([12608, 2304], f16), [0], True), {})
cnt: 1, ((T([64, 1, 768], f16, stride=(151296, 768, 1)), [0], True), {})
Operator: aten.unbind.int
cnt: 12, ((T([3, 64, 12, 197, 64], f16, stride=(768, 453888, 64, 2304, 1)),), {})

View File

@ -0,0 +1,244 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([128, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([128, 1000], f16), T([128, 1000], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 2, ((T([512, 256, 256], f16), -1, False), {})
cnt: 1, ((T([512, 64, 64], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 1, ((T([512, 64, 64], f16), T([512, 64, 64], f16), -1, f16), {})
cnt: 2, ((T([512, 256, 256], f16), T([512, 256, 256], f16), -1, f16), {})
Operator: aten._unsafe_view.default
cnt: 3, ((T([128, 256, 16, 16], f16), [512, 64, 256]), {})
cnt: 2, ((T([512, 256, 256], f16), [512, 256, 256]), {})
cnt: 2, ((T([512, 16, 16, 64], f16), [131072, 64]), {})
cnt: 4, ((T([131072, 31], f16), [512, 16, 16, 31]), {})
cnt: 2, ((T([512, 16, 16, 16, 16], f16), [512, 256, 256]), {})
cnt: 1, ((T([512, 256, 64], f16), [512, 256, 64]), {})
cnt: 3, ((T([512, 64, 256], f16), [128, 256, 16, 16]), {})
cnt: 3, ((T([128, 512, 16, 16], f16), [512, 128, 256]), {})
cnt: 2, ((T([512, 16, 16, 128], f16), [131072, 128]), {})
cnt: 1, ((T([512, 256, 128], f16), [512, 256, 128]), {})
cnt: 3, ((T([512, 128, 256], f16), [128, 512, 16, 16]), {})
cnt: 3, ((T([128, 512, 8, 8], f16), [512, 128, 64]), {})
cnt: 1, ((T([512, 64, 64], f16), [512, 64, 64]), {})
cnt: 2, ((T([512, 8, 8, 128], f16), [32768, 128]), {})
cnt: 2, ((T([32768, 15], f16), [512, 8, 8, 15]), {})
cnt: 1, ((T([512, 8, 8, 8, 8], f16), [512, 64, 64]), {})
cnt: 1, ((T([512, 64, 128], f16), [512, 64, 128]), {})
cnt: 3, ((T([512, 128, 64], f16), [128, 512, 8, 8]), {})
cnt: 1, ((T([512, 8, 8, 128], f16), [512, 64, 128]), {})
cnt: 1, ((T([512, 16, 16, 128], f16), [512, 256, 128]), {})
cnt: 1, ((T([512, 16, 16, 64], f16), [512, 256, 64]), {})
Operator: aten.add.Tensor
cnt: 31, ((T([], i64), 1), {})
cnt: 4, ((T([128, 256, 64, 64], f16), T([128, 256, 64, 64], f16)), {})
cnt: 4, ((T([128, 512, 32, 32], f16), T([128, 512, 32, 32], f16)), {})
cnt: 4, ((T([128, 1024, 16, 16], f16), T([128, 1024, 16, 16], f16)), {})
cnt: 2, ((T([512, 16, 16, 16, 16], f16, stride=(8432, 31, 527, 1, 0)), T([512, 16, 16, 16, 16], f16, stride=(8432, 527, 31, 0, 1))), {})
cnt: 2, ((T([512, 256, 256], f16), T([512, 256, 256], f16)), {})
cnt: 3, ((T([128, 2048, 8, 8], f16), T([128, 2048, 8, 8], f16)), {})
cnt: 1, ((T([512, 8, 8, 8, 8], f16, stride=(1080, 15, 135, 1, 0)), T([512, 8, 8, 8, 8], f16, stride=(1080, 135, 15, 0, 1))), {})
cnt: 1, ((T([512, 64, 64], f16), T([512, 64, 64], f16)), {})
cnt: 1, ((T([512, 8, 8, 128], f16, stride=(8192, 128, 1024, 1)), T([512, 8, 8, 128], f16)), {})
cnt: 1, ((T([512, 64, 128], f16), T([512, 64, 128], f16)), {})
cnt: 1, ((T([512, 16, 16, 128], f16, stride=(32768, 128, 2048, 1)), T([512, 16, 16, 128], f16)), {})
cnt: 1, ((T([512, 256, 128], f16), T([512, 256, 128], f16)), {})
cnt: 1, ((T([512, 16, 16, 64], f16, stride=(16384, 64, 1024, 1)), T([512, 16, 16, 64], f16)), {})
cnt: 1, ((T([512, 256, 64], f16), T([512, 256, 64], f16)), {})
cnt: 1, ((T([128, 64, 64, 64], f16), T([128, 64, 64, 64], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([128, 2048], f16), T([2048, 1000], f16, stride=(1, 2048))), {})
Operator: aten.avg_pool2d.default
cnt: 1, ((T([128, 512, 16, 16], f16), [2, 2], [2, 2]), {})
Operator: aten.avg_pool2d_backward.default
cnt: 1, ((T([128, 512, 8, 8], f16), T([128, 512, 16, 16], f16), [2, 2], [2, 2], [0, 0], False, True, None), {})
Operator: aten.bmm.default
cnt: 2, ((T([512, 256, 64], f16, stride=(16384, 1, 256)), T([512, 64, 256], f16)), {})
cnt: 2, ((T([512, 256, 256], f16), T([512, 256, 64], f16, stride=(16384, 1, 256))), {})
cnt: 2, ((T([512, 256, 128], f16, stride=(32768, 1, 256)), T([512, 128, 256], f16)), {})
cnt: 2, ((T([512, 256, 256], f16), T([512, 256, 128], f16, stride=(32768, 1, 256))), {})
cnt: 2, ((T([512, 64, 128], f16, stride=(8192, 1, 64)), T([512, 128, 64], f16)), {})
cnt: 2, ((T([512, 64, 64], f16), T([512, 64, 128], f16, stride=(8192, 1, 64))), {})
cnt: 1, ((T([512, 64, 64], f16, stride=(4096, 1, 64)), T([512, 64, 128], f16, stride=(8192, 1, 64))), {})
cnt: 1, ((T([512, 128, 64], f16), T([512, 64, 64], f16)), {})
cnt: 1, ((T([512, 256, 256], f16, stride=(65536, 1, 256)), T([512, 256, 128], f16, stride=(32768, 1, 256))), {})
cnt: 1, ((T([512, 128, 256], f16), T([512, 256, 256], f16)), {})
cnt: 1, ((T([512, 256, 256], f16, stride=(65536, 1, 256)), T([512, 256, 64], f16, stride=(16384, 1, 256))), {})
cnt: 1, ((T([512, 64, 256], f16), T([512, 256, 256], f16)), {})
Operator: aten.cat.default
cnt: 1, (([T([128, 512, 8, 8], f16), T([128, 512, 8, 8], f16), T([128, 512, 8, 8], f16)], 1), {})
cnt: 1, (([T([128, 512, 16, 16], f16), T([128, 512, 16, 16], f16), T([128, 512, 16, 16], f16)], 1), {})
cnt: 1, (([T([128, 256, 16, 16], f16), T([128, 256, 16, 16], f16), T([128, 256, 16, 16], f16)], 1), {})
Operator: aten.clone.default
cnt: 1, ((T([128, 3, 256, 256], f16),), {})
Operator: aten.constant_pad_nd.default
cnt: 4, ((T([8192, 16, 31], f16), [0, 1], 0.0), {})
cnt: 4, ((T([8192, 512], f16), [0, 15], 0.0), {})
cnt: 2, ((T([4096, 8, 15], f16), [0, 1], 0.0), {})
cnt: 2, ((T([4096, 128], f16), [0, 7], 0.0), {})
cnt: 2, ((T([4096, 135], f16), [0, -7]), {})
cnt: 2, ((T([4096, 8, 16], f16), [0, -1]), {})
cnt: 4, ((T([8192, 527], f16), [0, -15]), {})
cnt: 4, ((T([8192, 16, 32], f16), [0, -1]), {})
Operator: aten.convolution.default
cnt: 1, ((T([128, 3, 256, 256], f16), T([24, 3, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 24, 128, 128], f16), T([32, 24, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 32, 128, 128], f16), T([64, 32, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 64, 64, 64], f16), T([64, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 64, 64, 64], f16), T([64, 64, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 64, 64, 64], f16), T([256, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 64, 64], f16), T([64, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 64, 64], f16), T([128, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 128, 64, 64], f16), T([128, 128, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 128, 32, 32], f16), T([512, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 64, 64], f16), T([512, 256, 1, 1], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 512, 32, 32], f16), T([128, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 128, 32, 32], f16), T([128, 128, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 512, 32, 32], f16), T([256, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 32, 32], f16), T([256, 256, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 256, 16, 16], f16), T([1024, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 512, 32, 32], f16), T([1024, 512, 1, 1], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1024, 16, 16], f16), T([256, 1024, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 16, 16], f16), T([768, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1024, 16, 16], f16), T([512, 1024, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 512, 16, 16], f16), T([1536, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 512, 8, 8], f16), T([2048, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1024, 16, 16], f16), T([2048, 1024, 1, 1], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 2048, 8, 8], f16), T([512, 2048, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 512, 8, 8], f16), T([1536, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 2, ((T([128, 2048, 8, 8], f16), T([128, 512, 8, 8], f16), T([2048, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 1536, 8, 8], f16), T([128, 512, 8, 8], f16), T([1536, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 512, 8, 8], f16), T([128, 2048, 8, 8], f16), T([512, 2048, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 2048, 8, 8], f16), T([128, 1024, 16, 16], f16), T([2048, 1024, 1, 1], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 1536, 16, 16], f16), T([128, 512, 16, 16], f16), T([1536, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 512, 16, 16], f16), T([128, 1024, 16, 16], f16), T([512, 1024, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 1024, 16, 16], f16), T([128, 256, 16, 16], f16), T([1024, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 768, 16, 16], f16), T([128, 256, 16, 16], f16), T([768, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 256, 16, 16], f16), T([128, 1024, 16, 16], f16), T([256, 1024, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 1024, 16, 16], f16), T([128, 512, 32, 32], f16), T([1024, 512, 1, 1], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 256, 16, 16], f16), T([128, 256, 32, 32], f16), T([256, 256, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 256, 32, 32], f16), T([128, 512, 32, 32], f16), T([256, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 512, 32, 32], f16), T([128, 128, 32, 32], f16), T([512, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 128, 32, 32], f16), T([128, 128, 32, 32], f16), T([128, 128, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 128, 32, 32], f16), T([128, 512, 32, 32], f16), T([128, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 512, 32, 32], f16), T([128, 256, 64, 64], f16), T([512, 256, 1, 1], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 128, 32, 32], f16), T([128, 128, 64, 64], f16), T([128, 128, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 128, 64, 64], f16), T([128, 256, 64, 64], f16), T([128, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 256, 64, 64], f16), T([128, 64, 64, 64], f16), T([256, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 64, 64, 64], f16), T([128, 64, 64, 64], f16), T([64, 64, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 64, 64, 64], f16), T([128, 256, 64, 64], f16), T([64, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 64, 64, 64], f16), T([128, 64, 64, 64], f16), T([64, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 64, 128, 128], f16), T([128, 32, 128, 128], f16), T([64, 32, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 32, 128, 128], f16), T([128, 24, 128, 128], f16), T([32, 24, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 24, 128, 128], f16), T([128, 3, 256, 256], f16), T([24, 3, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [False, True, False]), {})
Operator: aten.copy_.default
cnt: 1, ((T([128, 3, 256, 256], f16), T([128, 3, 256, 256], f16)), {})
Operator: aten.div.Scalar
cnt: 1, ((T([128, 2048, 8, 8], f16, stride=(2048, 1, 0, 0)), 64), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([128], i64),), {})
Operator: aten.max_pool2d_with_indices.default
cnt: 1, ((T([128, 64, 128, 128], f16), [3, 3], [2, 2], [1, 1]), {})
Operator: aten.max_pool2d_with_indices_backward.default
cnt: 1, ((T([128, 64, 64, 64], f16), T([128, 64, 128, 128], f16), [3, 3], [2, 2], [1, 1], [1, 1], False, T([128, 64, 64, 64], i64)), {})
Operator: aten.mean.dim
cnt: 1, ((T([128, 2048, 8, 8], f16), [-1, -2], True), {})
Operator: aten.mm.default
cnt: 2, ((T([131072, 64], f16), T([64, 31], f16, stride=(1, 64))), {})
cnt: 2, ((T([131072, 128], f16), T([128, 31], f16, stride=(1, 128))), {})
cnt: 2, ((T([32768, 128], f16), T([128, 15], f16, stride=(1, 128))), {})
cnt: 1, ((T([128, 1000], f16), T([1000, 2048], f16)), {})
cnt: 1, ((T([1000, 128], f16, stride=(1, 1000)), T([128, 2048], f16)), {})
cnt: 2, ((T([15, 32768], f16, stride=(1, 15)), T([32768, 128], f16)), {})
cnt: 2, ((T([32768, 15], f16), T([15, 128], f16)), {})
cnt: 2, ((T([31, 131072], f16, stride=(1, 31)), T([131072, 128], f16)), {})
cnt: 2, ((T([131072, 31], f16), T([31, 128], f16)), {})
cnt: 2, ((T([31, 131072], f16, stride=(1, 31)), T([131072, 64], f16)), {})
cnt: 2, ((T([131072, 31], f16), T([31, 64], f16)), {})
Operator: aten.mul.Tensor
cnt: 2, ((T([512, 256, 256], f16), 0.125), {})
cnt: 2, ((T([512, 256, 256], f16), 0.08838834764831845), {})
cnt: 2, ((T([512, 64, 64], f16), 0.08838834764831845), {})
Operator: aten.native_batch_norm.default
cnt: 1, ((T([128, 24, 128, 128], f16), T([24], f16), T([24], f16), T([24], f16), T([24], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 32, 128, 128], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 64, 128, 128], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([128, 64, 64, 64], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 256, 64, 64], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 128, 64, 64], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 128, 32, 32], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 512, 32, 32], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 256, 32, 32], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 256, 16, 16], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 1024, 16, 16], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 512, 16, 16], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 512, 8, 8], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 2048, 8, 8], f16), T([2048], f16), T([2048], f16), T([2048], f16), T([2048], f16), True, 0.1, 1e-05), {})
Operator: aten.native_batch_norm_backward.default
cnt: 3, ((T([128, 2048, 8, 8], f16), T([128, 2048, 8, 8], f16), T([2048], f16), T([2048], f16), T([2048], f16), T([2048], f32), T([2048], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 512, 8, 8], f16), T([128, 512, 8, 8], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 512, 16, 16], f16), T([128, 512, 16, 16], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 1024, 16, 16], f16), T([128, 1024, 16, 16], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f32), T([1024], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 256, 16, 16], f16), T([128, 256, 16, 16], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 256, 32, 32], f16), T([128, 256, 32, 32], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 512, 32, 32], f16), T([128, 512, 32, 32], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 128, 32, 32], f16), T([128, 128, 32, 32], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 128, 64, 64], f16), T([128, 128, 64, 64], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 256, 64, 64], f16), T([128, 256, 64, 64], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([128, 64, 64, 64], f16), T([128, 64, 64, 64], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 64, 128, 128], f16), T([128, 64, 128, 128], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 32, 128, 128], f16), T([128, 32, 128, 128], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 24, 128, 128], f16), T([128, 24, 128, 128], f16), T([24], f16), T([24], f16), T([24], f16), T([24], f32), T([24], f32), True, 1e-05, [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([128, 1000], f16), T([128], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([128, 1000], f16), T([128], i64), None, 1, -100), {})
Operator: aten.relu_.default
cnt: 1, ((T([128, 24, 128, 128], f16),), {})
cnt: 1, ((T([128, 32, 128, 128], f16),), {})
cnt: 1, ((T([128, 64, 128, 128], f16),), {})
cnt: 4, ((T([128, 64, 64, 64], f16),), {})
cnt: 2, ((T([128, 256, 64, 64], f16),), {})
cnt: 1, ((T([128, 128, 64, 64], f16),), {})
cnt: 3, ((T([128, 128, 32, 32], f16),), {})
cnt: 2, ((T([128, 512, 32, 32], f16),), {})
cnt: 1, ((T([128, 256, 32, 32], f16),), {})
cnt: 3, ((T([128, 256, 16, 16], f16),), {})
cnt: 2, ((T([128, 1024, 16, 16], f16),), {})
cnt: 1, ((T([128, 512, 16, 16], f16),), {})
cnt: 3, ((T([128, 512, 8, 8], f16),), {})
cnt: 2, ((T([128, 2048, 8, 8], f16),), {})
Operator: aten.slice_backward.default
cnt: 2, ((T([4096, 8, 8], f16), [4096, 8, 15], 2, 7, 9223372036854775807, 1), {})
cnt: 2, ((T([4096, 8, 15], f16), [4096, 9, 15], 1, 0, 8, 1), {})
cnt: 2, ((T([4096, 9, 15], f16), [4096, 9, 15], 0, 0, 9223372036854775807, 1), {})
cnt: 4, ((T([8192, 16, 16], f16), [8192, 16, 31], 2, 15, 9223372036854775807, 1), {})
cnt: 4, ((T([8192, 16, 31], f16), [8192, 17, 31], 1, 0, 16, 1), {})
cnt: 4, ((T([8192, 17, 31], f16), [8192, 17, 31], 0, 0, 9223372036854775807, 1), {})
Operator: aten.split_with_sizes.default
cnt: 1, ((T([128, 768, 16, 16], f16), [256, 256, 256], 1), {})
cnt: 1, ((T([128, 1536, 16, 16], f16), [512, 512, 512], 1), {})
cnt: 1, ((T([128, 1536, 8, 8], f16), [512, 512, 512], 1), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([128, 1000], f16), [0], True), {})
cnt: 1, ((T([512, 8, 8, 8, 8], f16, stride=(4096, 64, 1, 512, 8)), [2], True), {})
cnt: 1, ((T([512, 8, 8, 8, 8], f16, stride=(4096, 512, 8, 64, 1)), [2], True), {})
cnt: 2, ((T([512, 16, 16, 16, 16], f16, stride=(65536, 256, 1, 4096, 16)), [2], True), {})
cnt: 2, ((T([512, 16, 16, 16, 16], f16, stride=(65536, 4096, 16, 256, 1)), [2], True), {})
Operator: aten.threshold_backward.default
cnt: 2, ((T([128, 2048, 8, 8], f16), T([128, 2048, 8, 8], f16), 0), {})
cnt: 3, ((T([128, 512, 8, 8], f16), T([128, 512, 8, 8], f16), 0), {})
cnt: 1, ((T([128, 512, 16, 16], f16), T([128, 512, 16, 16], f16), 0), {})
cnt: 2, ((T([128, 1024, 16, 16], f16), T([128, 1024, 16, 16], f16), 0), {})
cnt: 3, ((T([128, 256, 16, 16], f16), T([128, 256, 16, 16], f16), 0), {})
cnt: 1, ((T([128, 256, 32, 32], f16), T([128, 256, 32, 32], f16), 0), {})
cnt: 2, ((T([128, 512, 32, 32], f16), T([128, 512, 32, 32], f16), 0), {})
cnt: 3, ((T([128, 128, 32, 32], f16), T([128, 128, 32, 32], f16), 0), {})
cnt: 1, ((T([128, 128, 64, 64], f16), T([128, 128, 64, 64], f16), 0), {})
cnt: 2, ((T([128, 256, 64, 64], f16), T([128, 256, 64, 64], f16), 0), {})
cnt: 4, ((T([128, 64, 64, 64], f16), T([128, 64, 64, 64], f16), 0), {})
cnt: 1, ((T([128, 64, 128, 128], f16), T([128, 64, 128, 128], f16), 0), {})
cnt: 1, ((T([128, 32, 128, 128], f16), T([128, 32, 128, 128], f16), 0), {})
cnt: 1, ((T([128, 24, 128, 128], f16), T([128, 24, 128, 128], f16), 0), {})

View File

@ -0,0 +1,149 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([2, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([2, 1000], f16), T([2, 1000], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 36, ((T([2, 16, 576, 576], f16, stride=(5308416, 1, 9216, 16)), -1, False), {})
cnt: 2, ((T([2, 16, 1, 577], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 2, ((T([2, 16, 1, 577], f16), T([2, 16, 1, 577], f16), -1, f16), {})
cnt: 36, ((T([2, 16, 576, 576], f16, stride=(5308416, 1, 9216, 16)), T([2, 16, 576, 576], f16), -1, f16), {})
Operator: aten._unsafe_view.default
cnt: 108, ((T([2, 16, 576, 48], f16), [32, 576, 48]), {})
cnt: 36, ((T([2, 16, 48, 576], f16), [32, 48, 576]), {})
cnt: 36, ((T([32, 576, 576], f16), [2, 16, 576, 576]), {})
cnt: 144, ((T([2, 576, 576, 16], f16), [663552, 16]), {})
cnt: 72, ((T([663552, 16], f16), [2, 576, 576, 16]), {})
cnt: 72, ((T([2, 16, 576, 576], f16), [32, 576, 576]), {})
cnt: 36, ((T([32, 576, 48], f16), [2, 16, 576, 48]), {})
cnt: 36, ((T([2, 576, 16, 48], f16), [2, 576, 768]), {})
cnt: 2, ((T([2, 16, 48, 577], f16), [32, 48, 577]), {})
cnt: 2, ((T([32, 1, 577], f16), [2, 16, 1, 577]), {})
cnt: 2, ((T([2, 16, 577, 48], f16), [32, 577, 48]), {})
cnt: 2, ((T([32, 1, 48], f16), [2, 16, 1, 48]), {})
cnt: 2, ((T([2, 577, 16, 48], f16), [2, 577, 768]), {})
cnt: 2, ((T([2, 577, 768], f16), [1154, 768]), {})
cnt: 36, ((T([2, 576, 3, 16, 48], f16), [2, 576, 2304]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([2, 576, 768], f16, stride=(442368, 1, 576)), T([1, 576, 768], f16)), {})
cnt: 72, ((T([2, 576, 576, 16], f16), T([16], f16)), {})
cnt: 72, ((T([2, 576, 768], f16, stride=(442368, 1, 576)), T([2, 576, 768], f16)), {})
cnt: 1, ((T([2, 1, 768], f16, stride=(0, 768, 1)), T([2, 1, 768], f16)), {})
cnt: 4, ((T([2, 1, 768], f16), T([2, 1, 768], f16)), {})
cnt: 1, ((T([2, 1, 768], f16, stride=(443136, 768, 1)), T([2, 1, 768], f16)), {})
cnt: 4, ((T([2, 577, 768], f16), T([2, 577, 768], f16)), {})
cnt: 2, ((T([2, 1, 768], f16), T([2, 1, 768], f16, stride=(443136, 768, 1))), {})
cnt: 1, ((T([2, 576, 768], f16, stride=(443136, 768, 1)), T([2, 576, 768], f16, stride=(443136, 768, 1))), {})
cnt: 1, ((T([2, 576, 768], f16), T([2, 576, 768], f16, stride=(443136, 768, 1))), {})
cnt: 72, ((T([2, 576, 768], f16), T([2, 576, 768], f16)), {})
cnt: 72, ((T([3, 2, 16, 576, 48], f16), T([3, 2, 16, 576, 48], f16)), {})
Operator: aten.addmm.default
cnt: 36, ((T([2304], f16), T([1152, 768], f16), T([768, 2304], f16, stride=(1, 768))), {})
cnt: 36, ((T([768], f16), T([1152, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 36, ((T([3072], f16), T([1152, 768], f16), T([768, 3072], f16, stride=(1, 768))), {})
cnt: 36, ((T([768], f16), T([1152, 3072], f16), T([3072, 768], f16, stride=(1, 3072))), {})
cnt: 2, ((T([768], f16), T([2, 768], f16, stride=(443136, 1)), T([768, 768], f16, stride=(1, 768))), {})
cnt: 4, ((T([768], f16), T([1154, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 2, ((T([768], f16), T([2, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 2, ((T([3072], f16), T([2, 768], f16), T([768, 3072], f16, stride=(1, 768))), {})
cnt: 2, ((T([768], f16), T([2, 3072], f16), T([3072, 768], f16, stride=(1, 3072))), {})
cnt: 1, ((T([1000], f16), T([2, 768], f16, stride=(443136, 1)), T([768, 1000], f16, stride=(1, 768))), {})
Operator: aten.bmm.default
cnt: 36, ((T([32, 576, 48], f16), T([32, 48, 576], f16)), {})
cnt: 36, ((T([32, 576, 576], f16), T([32, 576, 48], f16)), {})
cnt: 2, ((T([32, 1, 48], f16), T([32, 48, 577], f16)), {})
cnt: 2, ((T([32, 1, 577], f16), T([32, 577, 48], f16)), {})
cnt: 2, ((T([32, 577, 1], f16), T([32, 1, 48], f16)), {})
cnt: 2, ((T([32, 1, 48], f16), T([32, 48, 577], f16, stride=(27696, 1, 48))), {})
cnt: 2, ((T([32, 48, 1], f16), T([32, 1, 577], f16)), {})
cnt: 2, ((T([32, 1, 577], f16), T([32, 577, 48], f16, stride=(27696, 1, 577))), {})
cnt: 36, ((T([32, 576, 576], f16, stride=(331776, 1, 576)), T([32, 576, 48], f16)), {})
cnt: 36, ((T([32, 576, 48], f16), T([32, 48, 576], f16, stride=(27648, 1, 48))), {})
cnt: 36, ((T([32, 48, 576], f16, stride=(27648, 1, 48)), T([32, 576, 576], f16)), {})
cnt: 36, ((T([32, 576, 576], f16), T([32, 576, 48], f16, stride=(27648, 1, 576))), {})
Operator: aten.cat.default
cnt: 1, (([T([2, 1, 768], f16, stride=(0, 768, 1)), T([2, 576, 768], f16, stride=(442368, 1, 576))], 1), {})
cnt: 2, (([T([2, 1, 768], f16), T([2, 576, 768], f16, stride=(442368, 1, 576))], 1), {})
Operator: aten.clone.default
cnt: 1, ((T([2, 3, 384, 384], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([2, 3, 384, 384], f16), T([768, 3, 16, 16], f16), T([768], f16), [16, 16], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([2, 768, 24, 24], f16, stride=(442368, 1, 18432, 768)), T([2, 3, 384, 384], f16), T([768, 3, 16, 16], f16), [768], [16, 16], [0, 0], [1, 1], False, [0, 0], 1, [False, True, True]), {})
Operator: aten.copy_.default
cnt: 1, ((T([2, 3, 384, 384], f16), T([2, 3, 384, 384], f16)), {})
Operator: aten.gelu.default
cnt: 36, ((T([2, 576, 3072], f16),), {})
cnt: 2, ((T([2, 1, 3072], f16),), {})
Operator: aten.gelu_backward.default
cnt: 2, ((T([2, 1, 3072], f16), T([2, 1, 3072], f16)), {})
cnt: 36, ((T([2, 576, 3072], f16), T([2, 576, 3072], f16)), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([2], i64),), {})
Operator: aten.mm.default
cnt: 72, ((T([663552, 16], f16), T([16, 16], f16, stride=(1, 16))), {})
cnt: 1, ((T([2, 1000], f16), T([1000, 768], f16)), {})
cnt: 1, ((T([1000, 2], f16, stride=(1, 1000)), T([2, 768], f16, stride=(443136, 1))), {})
cnt: 2, ((T([2, 768], f16), T([768, 3072], f16)), {})
cnt: 2, ((T([768, 2], f16, stride=(1, 768)), T([2, 3072], f16)), {})
cnt: 2, ((T([2, 3072], f16), T([3072, 768], f16)), {})
cnt: 2, ((T([3072, 2], f16, stride=(1, 3072)), T([2, 768], f16)), {})
cnt: 4, ((T([2, 768], f16), T([768, 768], f16)), {})
cnt: 2, ((T([768, 2], f16, stride=(1, 768)), T([2, 768], f16)), {})
cnt: 4, ((T([1154, 768], f16), T([768, 768], f16)), {})
cnt: 4, ((T([768, 1154], f16, stride=(1, 768)), T([1154, 768], f16)), {})
cnt: 2, ((T([768, 2], f16, stride=(1, 768)), T([2, 768], f16, stride=(443136, 1))), {})
cnt: 36, ((T([1152, 768], f16), T([768, 3072], f16)), {})
cnt: 36, ((T([768, 1152], f16, stride=(1, 768)), T([1152, 3072], f16)), {})
cnt: 36, ((T([1152, 3072], f16), T([3072, 768], f16)), {})
cnt: 36, ((T([3072, 1152], f16, stride=(1, 3072)), T([1152, 768], f16)), {})
cnt: 36, ((T([1152, 768], f16), T([768, 768], f16)), {})
cnt: 36, ((T([768, 1152], f16, stride=(1, 768)), T([1152, 768], f16)), {})
cnt: 72, ((T([16, 663552], f16, stride=(1, 16)), T([663552, 16], f16)), {})
cnt: 72, ((T([663552, 16], f16), T([16, 16], f16)), {})
cnt: 36, ((T([1152, 2304], f16), T([2304, 768], f16)), {})
cnt: 36, ((T([2304, 1152], f16, stride=(1, 2304)), T([1152, 768], f16)), {})
Operator: aten.mul.Tensor
cnt: 36, ((T([2, 16, 576, 48], f16, stride=(1327104, 48, 2304, 1)), 0.14433756729740643), {})
cnt: 72, ((T([768], f16), T([2, 576, 768], f16)), {})
cnt: 4, ((T([2, 16, 1, 48], f16), 0.14433756729740643), {})
cnt: 4, ((T([768], f16), T([2, 1, 768], f16)), {})
cnt: 1, ((T([2, 1, 768], f16, stride=(443136, 768, 1)), T([768], f16)), {})
cnt: 1, ((T([2, 1, 768], f16, stride=(443136, 768, 1)), T([2, 1, 768], f16)), {})
cnt: 3, ((T([2, 1, 768], f16), T([768], f16)), {})
cnt: 3, ((T([2, 1, 768], f16), T([2, 1, 768], f16)), {})
cnt: 72, ((T([2, 576, 768], f16), T([768], f16)), {})
cnt: 72, ((T([2, 576, 768], f16), T([2, 576, 768], f16)), {})
cnt: 36, ((T([2, 16, 576, 48], f16), 0.14433756729740643), {})
Operator: aten.native_layer_norm.default
cnt: 72, ((T([2, 576, 768], f16, stride=(442368, 1, 576)), [768], T([768], f16), T([768], f16), 1e-06), {})
cnt: 3, ((T([2, 577, 768], f16), [768], T([768], f16), T([768], f16), 1e-06), {})
cnt: 2, ((T([2, 1, 768], f16), [768], T([768], f16), T([768], f16), 1e-06), {})
Operator: aten.native_layer_norm_backward.default
cnt: 3, ((T([2, 577, 768], f16), T([2, 577, 768], f16), [768], T([2, 577, 1], f32), T([2, 577, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
cnt: 2, ((T([2, 1, 768], f16), T([2, 1, 768], f16), [768], T([2, 1, 1], f32), T([2, 1, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
cnt: 72, ((T([2, 576, 768], f16), T([2, 576, 768], f16, stride=(442368, 1, 576)), [768], T([2, 576, 1], f32), T([2, 576, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([2, 1000], f16), T([2], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([2, 1000], f16), T([2], i64), None, 1, -100), {})
Operator: aten.select_backward.default
cnt: 3, ((T([2, 768], f16), [2, 577, 768], 1, 0), {})
cnt: 36, ((T([2, 16, 576, 48], f16), [3, 2, 16, 576, 48], 0, 2), {})
cnt: 36, ((T([2, 16, 576, 48], f16, stride=(442368, 27648, 1, 576)), [3, 2, 16, 576, 48], 0, 1), {})
cnt: 36, ((T([2, 16, 576, 48], f16), [3, 2, 16, 576, 48], 0, 0), {})
Operator: aten.slice_backward.default
cnt: 3, ((T([2, 577, 768], f16), [2, 577, 768], 0, 0, 9223372036854775807, 1), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([2, 1000], f16), [0], True), {})
cnt: 4, ((T([2, 1, 768], f16), [0, 1], True), {})
cnt: 6, ((T([2, 768], f16), [0], True), {})
cnt: 2, ((T([2, 3072], f16), [0], True), {})
cnt: 4, ((T([1154, 768], f16), [0], True), {})
cnt: 1, ((T([2, 1, 768], f16), [0], True), {})
cnt: 72, ((T([2, 576, 768], f16), [0, 1], True), {})
cnt: 72, ((T([1152, 768], f16), [0], True), {})
cnt: 36, ((T([1152, 3072], f16), [0], True), {})
cnt: 72, ((T([2, 576, 576, 16], f16, stride=(5308416, 576, 1, 331776)), [0, 1, 2], True), {})
cnt: 36, ((T([1152, 2304], f16), [0], True), {})
cnt: 1, ((T([2, 576, 768], f16), [0], True), {})

View File

@ -0,0 +1,348 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([128, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([128, 1000], f16), T([128, 1000], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 2, ((T([128, 8, 3137, 8], f16, stride=(602304, 8, 192, 1)), 2, False), {})
cnt: 2, ((T([128, 8, 785, 16], f16, stride=(301440, 16, 384, 1)), 2, False), {})
cnt: 2, ((T([128, 8, 197, 40], f16, stride=(189120, 40, 960, 1)), 2, False), {})
cnt: 2, ((T([128, 8, 50, 64], f16, stride=(76800, 64, 1536, 1)), 2, False), {})
Operator: aten._softmax_backward_data.default
cnt: 2, ((T([128, 8, 50, 64], f16, stride=(25600, 3200, 1, 50)), T([128, 8, 50, 64], f16), 2, f16), {})
cnt: 2, ((T([128, 8, 197, 40], f16, stride=(63040, 7880, 1, 197)), T([128, 8, 197, 40], f16), 2, f16), {})
cnt: 2, ((T([128, 8, 785, 16], f16, stride=(100480, 12560, 1, 785)), T([128, 8, 785, 16], f16), 2, f16), {})
cnt: 2, ((T([128, 8, 3137, 8], f16, stride=(200768, 25096, 1, 3137)), T([128, 8, 3137, 8], f16), 2, f16), {})
Operator: aten._unsafe_view.default
cnt: 6, ((T([128, 8, 3137, 8], f16), [1024, 3137, 8]), {})
cnt: 2, ((T([1024, 8, 8], f16), [128, 8, 8, 8]), {})
cnt: 2, ((T([1024, 3137, 8], f16), [128, 8, 3137, 8]), {})
cnt: 2, ((T([128, 3137, 8, 8], f16), [128, 3137, 64]), {})
cnt: 6, ((T([128, 8, 785, 16], f16), [1024, 785, 16]), {})
cnt: 2, ((T([1024, 16, 16], f16), [128, 8, 16, 16]), {})
cnt: 2, ((T([1024, 785, 16], f16), [128, 8, 785, 16]), {})
cnt: 2, ((T([128, 785, 8, 16], f16), [128, 785, 128]), {})
cnt: 6, ((T([128, 8, 197, 40], f16), [1024, 197, 40]), {})
cnt: 2, ((T([1024, 40, 40], f16), [128, 8, 40, 40]), {})
cnt: 2, ((T([1024, 197, 40], f16), [128, 8, 197, 40]), {})
cnt: 2, ((T([128, 197, 8, 40], f16), [128, 197, 320]), {})
cnt: 6, ((T([128, 8, 50, 64], f16), [1024, 50, 64]), {})
cnt: 2, ((T([1024, 64, 64], f16), [128, 8, 64, 64]), {})
cnt: 2, ((T([1024, 50, 64], f16), [128, 8, 50, 64]), {})
cnt: 2, ((T([128, 50, 8, 64], f16), [128, 50, 512]), {})
cnt: 2, ((T([128, 50, 3, 8, 64], f16), [128, 50, 1536]), {})
cnt: 2, ((T([128, 197, 3, 8, 40], f16), [128, 197, 960]), {})
cnt: 2, ((T([128, 785, 3, 8, 16], f16), [128, 785, 384]), {})
cnt: 2, ((T([128, 3137, 3, 8, 8], f16), [128, 3137, 192]), {})
Operator: aten.add.Tensor
cnt: 2, ((T([128, 64, 56, 56], f16), T([128, 64, 56, 56], f16, stride=(200768, 1, 3584, 64))), {})
cnt: 6, ((T([128, 8, 3137, 8], f16), T([128, 8, 3137, 8], f16)), {})
cnt: 10, ((T([128, 3137, 64], f16), T([128, 3137, 64], f16)), {})
cnt: 2, ((T([128, 128, 28, 28], f16), T([128, 128, 28, 28], f16, stride=(100480, 1, 3584, 128))), {})
cnt: 6, ((T([128, 8, 785, 16], f16), T([128, 8, 785, 16], f16)), {})
cnt: 10, ((T([128, 785, 128], f16), T([128, 785, 128], f16)), {})
cnt: 2, ((T([128, 320, 14, 14], f16), T([128, 320, 14, 14], f16, stride=(63040, 1, 4480, 320))), {})
cnt: 6, ((T([128, 8, 197, 40], f16), T([128, 8, 197, 40], f16)), {})
cnt: 10, ((T([128, 197, 320], f16), T([128, 197, 320], f16)), {})
cnt: 2, ((T([128, 512, 7, 7], f16), T([128, 512, 7, 7], f16, stride=(25600, 1, 3584, 512))), {})
cnt: 6, ((T([128, 8, 50, 64], f16), T([128, 8, 50, 64], f16)), {})
cnt: 10, ((T([128, 50, 512], f16), T([128, 50, 512], f16)), {})
cnt: 4, ((T([3, 128, 8, 50, 64], f16), T([3, 128, 8, 50, 64], f16)), {})
cnt: 2, ((T([128, 512, 7, 7], f16, stride=(25600, 1, 3584, 512)), T([128, 512, 7, 7], f16, stride=(25088, 1, 3584, 512))), {})
cnt: 1, ((T([192, 1, 7, 7], f16), T([192, 1, 7, 7], f16)), {})
cnt: 2, ((T([192], f16), T([192], f16)), {})
cnt: 1, ((T([192, 1, 5, 5], f16), T([192, 1, 5, 5], f16)), {})
cnt: 2, ((T([128, 1, 3, 3], f16), T([128, 1, 3, 3], f16)), {})
cnt: 2, ((T([128], f16), T([128], f16)), {})
cnt: 1, ((T([512, 1, 3, 3], f16), T([512, 1, 3, 3], f16)), {})
cnt: 1, ((T([512], f16), T([512], f16)), {})
cnt: 4, ((T([3, 128, 8, 197, 40], f16), T([3, 128, 8, 197, 40], f16)), {})
cnt: 2, ((T([128, 320, 14, 14], f16, stride=(63040, 1, 4480, 320)), T([128, 320, 14, 14], f16, stride=(62720, 1, 4480, 320))), {})
cnt: 1, ((T([120, 1, 7, 7], f16), T([120, 1, 7, 7], f16)), {})
cnt: 2, ((T([120], f16), T([120], f16)), {})
cnt: 1, ((T([120, 1, 5, 5], f16), T([120, 1, 5, 5], f16)), {})
cnt: 1, ((T([80, 1, 3, 3], f16), T([80, 1, 3, 3], f16)), {})
cnt: 1, ((T([80], f16), T([80], f16)), {})
cnt: 1, ((T([320, 1, 3, 3], f16), T([320, 1, 3, 3], f16)), {})
cnt: 1, ((T([320], f16), T([320], f16)), {})
cnt: 4, ((T([3, 128, 8, 785, 16], f16), T([3, 128, 8, 785, 16], f16)), {})
cnt: 2, ((T([128, 128, 28, 28], f16, stride=(100480, 1, 3584, 128)), T([128, 128, 28, 28], f16, stride=(100352, 1, 3584, 128))), {})
cnt: 1, ((T([48, 1, 7, 7], f16), T([48, 1, 7, 7], f16)), {})
cnt: 2, ((T([48], f16), T([48], f16)), {})
cnt: 1, ((T([48, 1, 5, 5], f16), T([48, 1, 5, 5], f16)), {})
cnt: 1, ((T([32, 1, 3, 3], f16), T([32, 1, 3, 3], f16)), {})
cnt: 1, ((T([32], f16), T([32], f16)), {})
cnt: 4, ((T([3, 128, 8, 3137, 8], f16), T([3, 128, 8, 3137, 8], f16)), {})
cnt: 2, ((T([128, 64, 56, 56], f16, stride=(200768, 1, 3584, 64)), T([128, 64, 56, 56], f16, stride=(200704, 1, 3584, 64))), {})
cnt: 1, ((T([24, 1, 7, 7], f16), T([24, 1, 7, 7], f16)), {})
cnt: 2, ((T([24], f16), T([24], f16)), {})
cnt: 1, ((T([24, 1, 5, 5], f16), T([24, 1, 5, 5], f16)), {})
cnt: 1, ((T([16, 1, 3, 3], f16), T([16, 1, 3, 3], f16)), {})
cnt: 1, ((T([16], f16), T([16], f16)), {})
cnt: 1, ((T([64, 1, 3, 3], f16), T([64, 1, 3, 3], f16)), {})
cnt: 1, ((T([64], f16), T([64], f16)), {})
Operator: aten.addmm.default
cnt: 2, ((T([192], f16), T([401536, 64], f16), T([64, 192], f16, stride=(1, 64))), {})
cnt: 2, ((T([64], f16), T([401536, 64], f16), T([64, 64], f16, stride=(1, 64))), {})
cnt: 2, ((T([512], f16), T([401536, 64], f16), T([64, 512], f16, stride=(1, 64))), {})
cnt: 2, ((T([64], f16), T([401536, 512], f16), T([512, 64], f16, stride=(1, 512))), {})
cnt: 2, ((T([384], f16), T([100480, 128], f16), T([128, 384], f16, stride=(1, 128))), {})
cnt: 2, ((T([128], f16), T([100480, 128], f16), T([128, 128], f16, stride=(1, 128))), {})
cnt: 2, ((T([1024], f16), T([100480, 128], f16), T([128, 1024], f16, stride=(1, 128))), {})
cnt: 2, ((T([128], f16), T([100480, 1024], f16), T([1024, 128], f16, stride=(1, 1024))), {})
cnt: 2, ((T([960], f16), T([25216, 320], f16), T([320, 960], f16, stride=(1, 320))), {})
cnt: 2, ((T([320], f16), T([25216, 320], f16), T([320, 320], f16, stride=(1, 320))), {})
cnt: 2, ((T([1280], f16), T([25216, 320], f16), T([320, 1280], f16, stride=(1, 320))), {})
cnt: 2, ((T([320], f16), T([25216, 1280], f16), T([1280, 320], f16, stride=(1, 1280))), {})
cnt: 2, ((T([1536], f16), T([6400, 512], f16), T([512, 1536], f16, stride=(1, 512))), {})
cnt: 2, ((T([512], f16), T([6400, 512], f16), T([512, 512], f16, stride=(1, 512))), {})
cnt: 2, ((T([2048], f16), T([6400, 512], f16), T([512, 2048], f16, stride=(1, 512))), {})
cnt: 2, ((T([512], f16), T([6400, 2048], f16), T([2048, 512], f16, stride=(1, 2048))), {})
cnt: 1, ((T([1000], f16), T([128, 512], f16, stride=(25600, 1)), T([512, 1000], f16, stride=(1, 512))), {})
Operator: aten.bmm.default
cnt: 4, ((T([1024, 8, 3137], f16, stride=(25096, 1, 8)), T([1024, 3137, 8], f16)), {})
cnt: 4, ((T([1024, 3137, 8], f16), T([1024, 8, 8], f16)), {})
cnt: 4, ((T([1024, 16, 785], f16, stride=(12560, 1, 16)), T([1024, 785, 16], f16)), {})
cnt: 4, ((T([1024, 785, 16], f16), T([1024, 16, 16], f16)), {})
cnt: 4, ((T([1024, 40, 197], f16, stride=(7880, 1, 40)), T([1024, 197, 40], f16)), {})
cnt: 4, ((T([1024, 197, 40], f16), T([1024, 40, 40], f16)), {})
cnt: 4, ((T([1024, 64, 50], f16, stride=(3200, 1, 64)), T([1024, 50, 64], f16)), {})
cnt: 4, ((T([1024, 50, 64], f16), T([1024, 64, 64], f16)), {})
cnt: 2, ((T([1024, 50, 64], f16), T([1024, 64, 64], f16, stride=(4096, 1, 64))), {})
cnt: 2, ((T([1024, 64, 64], f16), T([1024, 64, 50], f16, stride=(3200, 1, 64))), {})
cnt: 2, ((T([1024, 197, 40], f16), T([1024, 40, 40], f16, stride=(1600, 1, 40))), {})
cnt: 2, ((T([1024, 40, 40], f16), T([1024, 40, 197], f16, stride=(7880, 1, 40))), {})
cnt: 2, ((T([1024, 785, 16], f16), T([1024, 16, 16], f16, stride=(256, 1, 16))), {})
cnt: 2, ((T([1024, 16, 16], f16), T([1024, 16, 785], f16, stride=(12560, 1, 16))), {})
cnt: 2, ((T([1024, 3137, 8], f16), T([1024, 8, 8], f16, stride=(64, 1, 8))), {})
cnt: 2, ((T([1024, 8, 8], f16), T([1024, 8, 3137], f16, stride=(25096, 1, 8))), {})
Operator: aten.cat.default
cnt: 1, (([T([128, 1, 64], f16, stride=(0, 64, 1)), T([128, 3136, 64], f16)], 1), {})
cnt: 2, (([T([128, 1, 64], f16, stride=(200768, 64, 1)), T([128, 3136, 64], f16, stride=(200704, 1, 3136))], 1), {})
cnt: 2, (([T([128, 16, 56, 56], f16), T([128, 24, 56, 56], f16), T([128, 24, 56, 56], f16)], 1), {})
cnt: 1, (([T([128, 1, 128], f16, stride=(0, 128, 1)), T([128, 784, 128], f16)], 1), {})
cnt: 2, (([T([128, 1, 128], f16, stride=(100480, 128, 1)), T([128, 784, 128], f16, stride=(100352, 1, 784))], 1), {})
cnt: 2, (([T([128, 32, 28, 28], f16), T([128, 48, 28, 28], f16), T([128, 48, 28, 28], f16)], 1), {})
cnt: 1, (([T([128, 1, 320], f16, stride=(0, 320, 1)), T([128, 196, 320], f16)], 1), {})
cnt: 2, (([T([128, 1, 320], f16, stride=(63040, 320, 1)), T([128, 196, 320], f16, stride=(62720, 1, 196))], 1), {})
cnt: 2, (([T([128, 80, 14, 14], f16), T([128, 120, 14, 14], f16), T([128, 120, 14, 14], f16)], 1), {})
cnt: 1, (([T([128, 1, 512], f16, stride=(0, 512, 1)), T([128, 49, 512], f16)], 1), {})
cnt: 2, (([T([128, 1, 512], f16, stride=(25600, 512, 1)), T([128, 49, 512], f16, stride=(25088, 1, 49))], 1), {})
cnt: 2, (([T([128, 128, 7, 7], f16), T([128, 192, 7, 7], f16), T([128, 192, 7, 7], f16)], 1), {})
cnt: 2, (([T([128, 128, 7, 7], f16, stride=(6272, 1, 896, 128)), T([128, 192, 7, 7], f16, stride=(9408, 1, 1344, 192)), T([128, 192, 7, 7], f16, stride=(9408, 1, 1344, 192))], 1), {})
cnt: 2, (([T([128, 80, 14, 14], f16, stride=(15680, 1, 1120, 80)), T([128, 120, 14, 14], f16, stride=(23520, 1, 1680, 120)), T([128, 120, 14, 14], f16, stride=(23520, 1, 1680, 120))], 1), {})
cnt: 2, (([T([128, 32, 28, 28], f16, stride=(25088, 1, 896, 32)), T([128, 48, 28, 28], f16, stride=(37632, 1, 1344, 48)), T([128, 48, 28, 28], f16, stride=(37632, 1, 1344, 48))], 1), {})
cnt: 2, (([T([128, 16, 56, 56], f16, stride=(50176, 1, 896, 16)), T([128, 24, 56, 56], f16, stride=(75264, 1, 1344, 24)), T([128, 24, 56, 56], f16, stride=(75264, 1, 1344, 24))], 1), {})
Operator: aten.clone.default
cnt: 1, ((T([128, 3, 224, 224], f16),), {})
Operator: aten.constant_pad_nd.default
cnt: 2, ((T([128, 8, 3136, 8], f16, stride=(200704, 8, 64, 1)), [0, 0, 1, 0, 0, 0], 0.0), {})
cnt: 2, ((T([128, 8, 784, 16], f16, stride=(100352, 16, 128, 1)), [0, 0, 1, 0, 0, 0], 0.0), {})
cnt: 2, ((T([128, 8, 196, 40], f16, stride=(62720, 40, 320, 1)), [0, 0, 1, 0, 0, 0], 0.0), {})
cnt: 2, ((T([128, 8, 49, 64], f16, stride=(25088, 64, 512, 1)), [0, 0, 1, 0, 0, 0], 0.0), {})
cnt: 2, ((T([128, 8, 50, 64], f16, stride=(25600, 64, 512, 1)), [0, 0, -1, 0, 0, 0]), {})
cnt: 2, ((T([128, 8, 197, 40], f16, stride=(63040, 40, 320, 1)), [0, 0, -1, 0, 0, 0]), {})
cnt: 2, ((T([128, 8, 785, 16], f16, stride=(100480, 16, 128, 1)), [0, 0, -1, 0, 0, 0]), {})
cnt: 2, ((T([128, 8, 3137, 8], f16, stride=(200768, 8, 64, 1)), [0, 0, -1, 0, 0, 0]), {})
Operator: aten.convolution.default
cnt: 1, ((T([128, 3, 224, 224], f16), T([64, 3, 4, 4], f16), T([64], f16), [4, 4], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 64, 56, 56], f16, stride=(200768, 1, 3584, 64)), T([64, 1, 3, 3], f16), T([64], f16), [1, 1], [1, 1], [1, 1], False, [0, 0], 64), {})
cnt: 2, ((T([128, 16, 56, 56], f16, stride=(602304, 1, 10752, 192)), T([16, 1, 3, 3], f16), T([16], f16), [1, 1], [1, 1], [1, 1], False, [0, 0], 16), {})
cnt: 2, ((T([128, 24, 56, 56], f16, stride=(602304, 1, 10752, 192)), T([24, 1, 5, 5], f16), T([24], f16), [1, 1], [2, 2], [1, 1], False, [0, 0], 24), {})
cnt: 2, ((T([128, 24, 56, 56], f16, stride=(602304, 1, 10752, 192)), T([24, 1, 7, 7], f16), T([24], f16), [1, 1], [3, 3], [1, 1], False, [0, 0], 24), {})
cnt: 1, ((T([128, 64, 56, 56], f16), T([128, 64, 2, 2], f16), T([128], f16), [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 128, 28, 28], f16, stride=(100480, 1, 3584, 128)), T([128, 1, 3, 3], f16), T([128], f16), [1, 1], [1, 1], [1, 1], False, [0, 0], 128), {})
cnt: 2, ((T([128, 32, 28, 28], f16, stride=(301440, 1, 10752, 384)), T([32, 1, 3, 3], f16), T([32], f16), [1, 1], [1, 1], [1, 1], False, [0, 0], 32), {})
cnt: 2, ((T([128, 48, 28, 28], f16, stride=(301440, 1, 10752, 384)), T([48, 1, 5, 5], f16), T([48], f16), [1, 1], [2, 2], [1, 1], False, [0, 0], 48), {})
cnt: 2, ((T([128, 48, 28, 28], f16, stride=(301440, 1, 10752, 384)), T([48, 1, 7, 7], f16), T([48], f16), [1, 1], [3, 3], [1, 1], False, [0, 0], 48), {})
cnt: 1, ((T([128, 128, 28, 28], f16), T([320, 128, 2, 2], f16), T([320], f16), [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 320, 14, 14], f16, stride=(63040, 1, 4480, 320)), T([320, 1, 3, 3], f16), T([320], f16), [1, 1], [1, 1], [1, 1], False, [0, 0], 320), {})
cnt: 2, ((T([128, 80, 14, 14], f16, stride=(189120, 1, 13440, 960)), T([80, 1, 3, 3], f16), T([80], f16), [1, 1], [1, 1], [1, 1], False, [0, 0], 80), {})
cnt: 2, ((T([128, 120, 14, 14], f16, stride=(189120, 1, 13440, 960)), T([120, 1, 5, 5], f16), T([120], f16), [1, 1], [2, 2], [1, 1], False, [0, 0], 120), {})
cnt: 2, ((T([128, 120, 14, 14], f16, stride=(189120, 1, 13440, 960)), T([120, 1, 7, 7], f16), T([120], f16), [1, 1], [3, 3], [1, 1], False, [0, 0], 120), {})
cnt: 1, ((T([128, 320, 14, 14], f16), T([512, 320, 2, 2], f16), T([512], f16), [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 512, 7, 7], f16, stride=(25600, 1, 3584, 512)), T([512, 1, 3, 3], f16), T([512], f16), [1, 1], [1, 1], [1, 1], False, [0, 0], 512), {})
cnt: 2, ((T([128, 128, 7, 7], f16, stride=(76800, 1, 10752, 1536)), T([128, 1, 3, 3], f16), T([128], f16), [1, 1], [1, 1], [1, 1], False, [0, 0], 128), {})
cnt: 2, ((T([128, 192, 7, 7], f16, stride=(76800, 1, 10752, 1536)), T([192, 1, 5, 5], f16), T([192], f16), [1, 1], [2, 2], [1, 1], False, [0, 0], 192), {})
cnt: 2, ((T([128, 192, 7, 7], f16, stride=(76800, 1, 10752, 1536)), T([192, 1, 7, 7], f16), T([192], f16), [1, 1], [3, 3], [1, 1], False, [0, 0], 192), {})
Operator: aten.convolution_backward.default
cnt: 2, ((T([128, 192, 7, 7], f16, stride=(25088, 1, 3584, 512)), T([128, 192, 7, 7], f16, stride=(76800, 1, 10752, 1536)), T([192, 1, 7, 7], f16), [192], [1, 1], [3, 3], [1, 1], False, [0, 0], 192, [True, True, True]), {})
cnt: 2, ((T([128, 192, 7, 7], f16, stride=(25088, 1, 3584, 512)), T([128, 192, 7, 7], f16, stride=(76800, 1, 10752, 1536)), T([192, 1, 5, 5], f16), [192], [1, 1], [2, 2], [1, 1], False, [0, 0], 192, [True, True, True]), {})
cnt: 2, ((T([128, 128, 7, 7], f16, stride=(25088, 1, 3584, 512)), T([128, 128, 7, 7], f16, stride=(76800, 1, 10752, 1536)), T([128, 1, 3, 3], f16), [128], [1, 1], [1, 1], [1, 1], False, [0, 0], 128, [True, True, True]), {})
cnt: 2, ((T([128, 512, 7, 7], f16, stride=(25600, 1, 3584, 512)), T([128, 512, 7, 7], f16, stride=(25600, 1, 3584, 512)), T([512, 1, 3, 3], f16), [512], [1, 1], [1, 1], [1, 1], False, [0, 0], 512, [True, True, True]), {})
cnt: 1, ((T([128, 512, 7, 7], f16, stride=(25088, 1, 3584, 512)), T([128, 320, 14, 14], f16), T([512, 320, 2, 2], f16), [512], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 2, ((T([128, 120, 14, 14], f16, stride=(62720, 1, 4480, 320)), T([128, 120, 14, 14], f16, stride=(189120, 1, 13440, 960)), T([120, 1, 7, 7], f16), [120], [1, 1], [3, 3], [1, 1], False, [0, 0], 120, [True, True, True]), {})
cnt: 2, ((T([128, 120, 14, 14], f16, stride=(62720, 1, 4480, 320)), T([128, 120, 14, 14], f16, stride=(189120, 1, 13440, 960)), T([120, 1, 5, 5], f16), [120], [1, 1], [2, 2], [1, 1], False, [0, 0], 120, [True, True, True]), {})
cnt: 2, ((T([128, 80, 14, 14], f16, stride=(62720, 1, 4480, 320)), T([128, 80, 14, 14], f16, stride=(189120, 1, 13440, 960)), T([80, 1, 3, 3], f16), [80], [1, 1], [1, 1], [1, 1], False, [0, 0], 80, [True, True, True]), {})
cnt: 2, ((T([128, 320, 14, 14], f16, stride=(63040, 1, 4480, 320)), T([128, 320, 14, 14], f16, stride=(63040, 1, 4480, 320)), T([320, 1, 3, 3], f16), [320], [1, 1], [1, 1], [1, 1], False, [0, 0], 320, [True, True, True]), {})
cnt: 1, ((T([128, 320, 14, 14], f16, stride=(62720, 1, 4480, 320)), T([128, 128, 28, 28], f16), T([320, 128, 2, 2], f16), [320], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 2, ((T([128, 48, 28, 28], f16, stride=(100352, 1, 3584, 128)), T([128, 48, 28, 28], f16, stride=(301440, 1, 10752, 384)), T([48, 1, 7, 7], f16), [48], [1, 1], [3, 3], [1, 1], False, [0, 0], 48, [True, True, True]), {})
cnt: 2, ((T([128, 48, 28, 28], f16, stride=(100352, 1, 3584, 128)), T([128, 48, 28, 28], f16, stride=(301440, 1, 10752, 384)), T([48, 1, 5, 5], f16), [48], [1, 1], [2, 2], [1, 1], False, [0, 0], 48, [True, True, True]), {})
cnt: 2, ((T([128, 32, 28, 28], f16, stride=(100352, 1, 3584, 128)), T([128, 32, 28, 28], f16, stride=(301440, 1, 10752, 384)), T([32, 1, 3, 3], f16), [32], [1, 1], [1, 1], [1, 1], False, [0, 0], 32, [True, True, True]), {})
cnt: 2, ((T([128, 128, 28, 28], f16, stride=(100480, 1, 3584, 128)), T([128, 128, 28, 28], f16, stride=(100480, 1, 3584, 128)), T([128, 1, 3, 3], f16), [128], [1, 1], [1, 1], [1, 1], False, [0, 0], 128, [True, True, True]), {})
cnt: 1, ((T([128, 128, 28, 28], f16, stride=(100352, 1, 3584, 128)), T([128, 64, 56, 56], f16), T([128, 64, 2, 2], f16), [128], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 2, ((T([128, 24, 56, 56], f16, stride=(200704, 1, 3584, 64)), T([128, 24, 56, 56], f16, stride=(602304, 1, 10752, 192)), T([24, 1, 7, 7], f16), [24], [1, 1], [3, 3], [1, 1], False, [0, 0], 24, [True, True, True]), {})
cnt: 2, ((T([128, 24, 56, 56], f16, stride=(200704, 1, 3584, 64)), T([128, 24, 56, 56], f16, stride=(602304, 1, 10752, 192)), T([24, 1, 5, 5], f16), [24], [1, 1], [2, 2], [1, 1], False, [0, 0], 24, [True, True, True]), {})
cnt: 2, ((T([128, 16, 56, 56], f16, stride=(200704, 1, 3584, 64)), T([128, 16, 56, 56], f16, stride=(602304, 1, 10752, 192)), T([16, 1, 3, 3], f16), [16], [1, 1], [1, 1], [1, 1], False, [0, 0], 16, [True, True, True]), {})
cnt: 2, ((T([128, 64, 56, 56], f16, stride=(200768, 1, 3584, 64)), T([128, 64, 56, 56], f16, stride=(200768, 1, 3584, 64)), T([64, 1, 3, 3], f16), [64], [1, 1], [1, 1], [1, 1], False, [0, 0], 64, [True, True, True]), {})
cnt: 1, ((T([128, 64, 56, 56], f16, stride=(200704, 1, 3584, 64)), T([128, 3, 224, 224], f16), T([64, 3, 4, 4], f16), [64], [4, 4], [0, 0], [1, 1], False, [0, 0], 1, [False, True, True]), {})
Operator: aten.copy_.default
cnt: 1, ((T([128, 3, 224, 224], f16), T([128, 3, 224, 224], f16)), {})
Operator: aten.gelu.default
cnt: 2, ((T([128, 3137, 512], f16),), {})
cnt: 2, ((T([128, 785, 1024], f16),), {})
cnt: 2, ((T([128, 197, 1280], f16),), {})
cnt: 2, ((T([128, 50, 2048], f16),), {})
Operator: aten.gelu_backward.default
cnt: 2, ((T([128, 50, 2048], f16), T([128, 50, 2048], f16)), {})
cnt: 2, ((T([128, 197, 1280], f16), T([128, 197, 1280], f16)), {})
cnt: 2, ((T([128, 785, 1024], f16), T([128, 785, 1024], f16)), {})
cnt: 2, ((T([128, 3137, 512], f16), T([128, 3137, 512], f16)), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([128], i64),), {})
Operator: aten.mm.default
cnt: 1, ((T([128, 1000], f16), T([1000, 512], f16)), {})
cnt: 1, ((T([1000, 128], f16, stride=(1, 1000)), T([128, 512], f16, stride=(25600, 1))), {})
cnt: 2, ((T([6400, 512], f16), T([512, 2048], f16)), {})
cnt: 2, ((T([512, 6400], f16, stride=(1, 512)), T([6400, 2048], f16)), {})
cnt: 2, ((T([6400, 2048], f16), T([2048, 512], f16)), {})
cnt: 2, ((T([2048, 6400], f16, stride=(1, 2048)), T([6400, 512], f16)), {})
cnt: 2, ((T([6400, 512], f16), T([512, 512], f16)), {})
cnt: 2, ((T([512, 6400], f16, stride=(1, 512)), T([6400, 512], f16)), {})
cnt: 2, ((T([6400, 1536], f16), T([1536, 512], f16)), {})
cnt: 2, ((T([1536, 6400], f16, stride=(1, 1536)), T([6400, 512], f16)), {})
cnt: 2, ((T([25216, 320], f16), T([320, 1280], f16)), {})
cnt: 2, ((T([320, 25216], f16, stride=(1, 320)), T([25216, 1280], f16)), {})
cnt: 2, ((T([25216, 1280], f16), T([1280, 320], f16)), {})
cnt: 2, ((T([1280, 25216], f16, stride=(1, 1280)), T([25216, 320], f16)), {})
cnt: 2, ((T([25216, 320], f16), T([320, 320], f16)), {})
cnt: 2, ((T([320, 25216], f16, stride=(1, 320)), T([25216, 320], f16)), {})
cnt: 2, ((T([25216, 960], f16), T([960, 320], f16)), {})
cnt: 2, ((T([960, 25216], f16, stride=(1, 960)), T([25216, 320], f16)), {})
cnt: 2, ((T([100480, 128], f16), T([128, 1024], f16)), {})
cnt: 2, ((T([128, 100480], f16, stride=(1, 128)), T([100480, 1024], f16)), {})
cnt: 2, ((T([100480, 1024], f16), T([1024, 128], f16)), {})
cnt: 2, ((T([1024, 100480], f16, stride=(1, 1024)), T([100480, 128], f16)), {})
cnt: 2, ((T([100480, 128], f16), T([128, 128], f16)), {})
cnt: 2, ((T([128, 100480], f16, stride=(1, 128)), T([100480, 128], f16)), {})
cnt: 2, ((T([100480, 384], f16), T([384, 128], f16)), {})
cnt: 2, ((T([384, 100480], f16, stride=(1, 384)), T([100480, 128], f16)), {})
cnt: 2, ((T([401536, 64], f16), T([64, 512], f16)), {})
cnt: 2, ((T([64, 401536], f16, stride=(1, 64)), T([401536, 512], f16)), {})
cnt: 2, ((T([401536, 512], f16), T([512, 64], f16)), {})
cnt: 2, ((T([512, 401536], f16, stride=(1, 512)), T([401536, 64], f16)), {})
cnt: 2, ((T([401536, 64], f16), T([64, 64], f16)), {})
cnt: 2, ((T([64, 401536], f16, stride=(1, 64)), T([401536, 64], f16)), {})
cnt: 2, ((T([401536, 192], f16), T([192, 64], f16)), {})
cnt: 2, ((T([192, 401536], f16, stride=(1, 192)), T([401536, 64], f16)), {})
Operator: aten.mul.Tensor
cnt: 2, ((T([128, 8, 3136, 8], f16, stride=(602304, 8, 192, 1)), T([128, 8, 3136, 8], f16, stride=(200704, 25088, 1, 3136))), {})
cnt: 2, ((T([128, 8, 3137, 8], f16), 0.3535533905932738), {})
cnt: 2, ((T([128, 8, 784, 16], f16, stride=(301440, 16, 384, 1)), T([128, 8, 784, 16], f16, stride=(100352, 12544, 1, 784))), {})
cnt: 2, ((T([128, 8, 785, 16], f16), 0.25), {})
cnt: 2, ((T([128, 8, 196, 40], f16, stride=(189120, 40, 960, 1)), T([128, 8, 196, 40], f16, stride=(62720, 7840, 1, 196))), {})
cnt: 2, ((T([128, 8, 197, 40], f16), 0.15811388300841897), {})
cnt: 2, ((T([128, 8, 49, 64], f16, stride=(76800, 64, 1536, 1)), T([128, 8, 49, 64], f16, stride=(25088, 3136, 1, 49))), {})
cnt: 2, ((T([128, 8, 50, 64], f16), 0.125), {})
cnt: 2, ((T([128, 8, 50, 64], f16, stride=(25600, 64, 512, 1)), 0.125), {})
cnt: 2, ((T([128, 8, 49, 64], f16, stride=(25088, 64, 512, 1)), T([128, 8, 49, 64], f16, stride=(76800, 64, 1536, 1))), {})
cnt: 2, ((T([128, 8, 49, 64], f16, stride=(25088, 64, 512, 1)), T([128, 8, 49, 64], f16, stride=(25088, 3136, 1, 49))), {})
cnt: 2, ((T([128, 8, 197, 40], f16, stride=(63040, 40, 320, 1)), 0.15811388300841897), {})
cnt: 2, ((T([128, 8, 196, 40], f16, stride=(62720, 40, 320, 1)), T([128, 8, 196, 40], f16, stride=(189120, 40, 960, 1))), {})
cnt: 2, ((T([128, 8, 196, 40], f16, stride=(62720, 40, 320, 1)), T([128, 8, 196, 40], f16, stride=(62720, 7840, 1, 196))), {})
cnt: 2, ((T([128, 8, 785, 16], f16, stride=(100480, 16, 128, 1)), 0.25), {})
cnt: 2, ((T([128, 8, 784, 16], f16, stride=(100352, 16, 128, 1)), T([128, 8, 784, 16], f16, stride=(301440, 16, 384, 1))), {})
cnt: 2, ((T([128, 8, 784, 16], f16, stride=(100352, 16, 128, 1)), T([128, 8, 784, 16], f16, stride=(100352, 12544, 1, 784))), {})
cnt: 2, ((T([128, 8, 3137, 8], f16, stride=(200768, 8, 64, 1)), 0.3535533905932738), {})
cnt: 2, ((T([128, 8, 3136, 8], f16, stride=(200704, 8, 64, 1)), T([128, 8, 3136, 8], f16, stride=(602304, 8, 192, 1))), {})
cnt: 2, ((T([128, 8, 3136, 8], f16, stride=(200704, 8, 64, 1)), T([128, 8, 3136, 8], f16, stride=(200704, 25088, 1, 3136))), {})
Operator: aten.native_layer_norm.default
cnt: 1, ((T([128, 3136, 64], f16, stride=(200704, 1, 3136)), [64], T([64], f16), T([64], f16), 1e-05), {})
cnt: 4, ((T([128, 3137, 64], f16), [64], T([64], f16), T([64], f16), 1e-06), {})
cnt: 1, ((T([128, 784, 128], f16, stride=(100352, 1, 784)), [128], T([128], f16), T([128], f16), 1e-05), {})
cnt: 4, ((T([128, 785, 128], f16), [128], T([128], f16), T([128], f16), 1e-06), {})
cnt: 1, ((T([128, 196, 320], f16, stride=(62720, 1, 196)), [320], T([320], f16), T([320], f16), 1e-05), {})
cnt: 4, ((T([128, 197, 320], f16), [320], T([320], f16), T([320], f16), 1e-06), {})
cnt: 1, ((T([128, 49, 512], f16, stride=(25088, 1, 49)), [512], T([512], f16), T([512], f16), 1e-05), {})
cnt: 5, ((T([128, 50, 512], f16), [512], T([512], f16), T([512], f16), 1e-06), {})
Operator: aten.native_layer_norm_backward.default
cnt: 5, ((T([128, 50, 512], f16), T([128, 50, 512], f16), [512], T([128, 50, 1], f32), T([128, 50, 1], f32), T([512], f16), T([512], f16), [True, True, True]), {})
cnt: 1, ((T([128, 49, 512], f16, stride=(25600, 512, 1)), T([128, 49, 512], f16, stride=(25088, 1, 49)), [512], T([128, 49, 1], f32), T([128, 49, 1], f32), T([512], f16), T([512], f16), [True, True, True]), {})
cnt: 4, ((T([128, 197, 320], f16), T([128, 197, 320], f16), [320], T([128, 197, 1], f32), T([128, 197, 1], f32), T([320], f16), T([320], f16), [True, True, True]), {})
cnt: 1, ((T([128, 196, 320], f16, stride=(63040, 320, 1)), T([128, 196, 320], f16, stride=(62720, 1, 196)), [320], T([128, 196, 1], f32), T([128, 196, 1], f32), T([320], f16), T([320], f16), [True, True, True]), {})
cnt: 4, ((T([128, 785, 128], f16), T([128, 785, 128], f16), [128], T([128, 785, 1], f32), T([128, 785, 1], f32), T([128], f16), T([128], f16), [True, True, True]), {})
cnt: 1, ((T([128, 784, 128], f16, stride=(100480, 128, 1)), T([128, 784, 128], f16, stride=(100352, 1, 784)), [128], T([128, 784, 1], f32), T([128, 784, 1], f32), T([128], f16), T([128], f16), [True, True, True]), {})
cnt: 4, ((T([128, 3137, 64], f16), T([128, 3137, 64], f16), [64], T([128, 3137, 1], f32), T([128, 3137, 1], f32), T([64], f16), T([64], f16), [True, True, True]), {})
cnt: 1, ((T([128, 3136, 64], f16, stride=(200768, 64, 1)), T([128, 3136, 64], f16, stride=(200704, 1, 3136)), [64], T([128, 3136, 1], f32), T([128, 3136, 1], f32), T([64], f16), T([64], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([128, 1000], f16), T([128], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([128, 1000], f16), T([128], i64), None, 1, -100), {})
Operator: aten.select_backward.default
cnt: 1, ((T([128, 512], f16), [128, 50, 512], 1, 0), {})
cnt: 2, ((T([128, 8, 50, 64], f16), [3, 128, 8, 50, 64], 0, 2), {})
cnt: 2, ((T([128, 8, 50, 64], f16), [3, 128, 8, 50, 64], 0, 1), {})
cnt: 2, ((T([128, 8, 50, 64], f16), [3, 128, 8, 50, 64], 0, 0), {})
cnt: 2, ((T([128, 8, 197, 40], f16), [3, 128, 8, 197, 40], 0, 2), {})
cnt: 2, ((T([128, 8, 197, 40], f16), [3, 128, 8, 197, 40], 0, 1), {})
cnt: 2, ((T([128, 8, 197, 40], f16), [3, 128, 8, 197, 40], 0, 0), {})
cnt: 2, ((T([128, 8, 785, 16], f16), [3, 128, 8, 785, 16], 0, 2), {})
cnt: 2, ((T([128, 8, 785, 16], f16), [3, 128, 8, 785, 16], 0, 1), {})
cnt: 2, ((T([128, 8, 785, 16], f16), [3, 128, 8, 785, 16], 0, 0), {})
cnt: 2, ((T([128, 8, 3137, 8], f16), [3, 128, 8, 3137, 8], 0, 2), {})
cnt: 2, ((T([128, 8, 3137, 8], f16), [3, 128, 8, 3137, 8], 0, 1), {})
cnt: 2, ((T([128, 8, 3137, 8], f16), [3, 128, 8, 3137, 8], 0, 0), {})
Operator: aten.slice_backward.default
cnt: 5, ((T([128, 50, 512], f16), [128, 50, 512], 0, 0, 9223372036854775807, 1), {})
cnt: 4, ((T([128, 8, 49, 64], f16, stride=(25088, 64, 512, 1)), [128, 8, 49, 64], 3, 0, 9223372036854775807, 1), {})
cnt: 4, ((T([128, 8, 49, 64], f16), [128, 8, 50, 64], 2, 1, 9223372036854775807, 1), {})
cnt: 4, ((T([128, 8, 50, 64], f16), [128, 8, 50, 64], 1, 0, 9223372036854775807, 1), {})
cnt: 4, ((T([128, 8, 50, 64], f16), [128, 8, 50, 64], 0, 0, 9223372036854775807, 1), {})
cnt: 2, ((T([128, 49, 512], f16), [128, 50, 512], 1, 1, 9223372036854775807, 1), {})
cnt: 2, ((T([128, 1, 512], f16, stride=(25600, 512, 1)), [128, 50, 512], 1, 0, 1, 1), {})
cnt: 1, ((T([128, 196, 320], f16, stride=(62720, 1, 196)), [128, 196, 320], 2, 0, 9223372036854775807, 1), {})
cnt: 3, ((T([128, 196, 320], f16), [128, 197, 320], 1, 1, 9223372036854775807, 1), {})
cnt: 5, ((T([128, 197, 320], f16), [128, 197, 320], 0, 0, 9223372036854775807, 1), {})
cnt: 4, ((T([128, 8, 196, 40], f16, stride=(62720, 40, 320, 1)), [128, 8, 196, 40], 3, 0, 9223372036854775807, 1), {})
cnt: 4, ((T([128, 8, 196, 40], f16), [128, 8, 197, 40], 2, 1, 9223372036854775807, 1), {})
cnt: 4, ((T([128, 8, 197, 40], f16), [128, 8, 197, 40], 1, 0, 9223372036854775807, 1), {})
cnt: 4, ((T([128, 8, 197, 40], f16), [128, 8, 197, 40], 0, 0, 9223372036854775807, 1), {})
cnt: 2, ((T([128, 1, 320], f16, stride=(63040, 320, 1)), [128, 197, 320], 1, 0, 1, 1), {})
cnt: 1, ((T([128, 784, 128], f16, stride=(100352, 1, 784)), [128, 784, 128], 2, 0, 9223372036854775807, 1), {})
cnt: 3, ((T([128, 784, 128], f16), [128, 785, 128], 1, 1, 9223372036854775807, 1), {})
cnt: 5, ((T([128, 785, 128], f16), [128, 785, 128], 0, 0, 9223372036854775807, 1), {})
cnt: 4, ((T([128, 8, 784, 16], f16, stride=(100352, 16, 128, 1)), [128, 8, 784, 16], 3, 0, 9223372036854775807, 1), {})
cnt: 4, ((T([128, 8, 784, 16], f16), [128, 8, 785, 16], 2, 1, 9223372036854775807, 1), {})
cnt: 4, ((T([128, 8, 785, 16], f16), [128, 8, 785, 16], 1, 0, 9223372036854775807, 1), {})
cnt: 4, ((T([128, 8, 785, 16], f16), [128, 8, 785, 16], 0, 0, 9223372036854775807, 1), {})
cnt: 2, ((T([128, 1, 128], f16, stride=(100480, 128, 1)), [128, 785, 128], 1, 0, 1, 1), {})
cnt: 1, ((T([128, 3136, 64], f16, stride=(200704, 1, 3136)), [128, 3136, 64], 2, 0, 9223372036854775807, 1), {})
cnt: 3, ((T([128, 3136, 64], f16), [128, 3137, 64], 1, 1, 9223372036854775807, 1), {})
cnt: 5, ((T([128, 3137, 64], f16), [128, 3137, 64], 0, 0, 9223372036854775807, 1), {})
cnt: 4, ((T([128, 8, 3136, 8], f16, stride=(200704, 8, 64, 1)), [128, 8, 3136, 8], 3, 0, 9223372036854775807, 1), {})
cnt: 4, ((T([128, 8, 3136, 8], f16), [128, 8, 3137, 8], 2, 1, 9223372036854775807, 1), {})
cnt: 4, ((T([128, 8, 3137, 8], f16), [128, 8, 3137, 8], 1, 0, 9223372036854775807, 1), {})
cnt: 4, ((T([128, 8, 3137, 8], f16), [128, 8, 3137, 8], 0, 0, 9223372036854775807, 1), {})
cnt: 2, ((T([128, 1, 64], f16, stride=(200768, 64, 1)), [128, 3137, 64], 1, 0, 1, 1), {})
Operator: aten.split_with_sizes.default
cnt: 2, ((T([128, 64, 56, 56], f16, stride=(602304, 1, 10752, 192)), [16, 24, 24], 1), {})
cnt: 2, ((T([128, 128, 28, 28], f16, stride=(301440, 1, 10752, 384)), [32, 48, 48], 1), {})
cnt: 2, ((T([128, 320, 14, 14], f16, stride=(189120, 1, 13440, 960)), [80, 120, 120], 1), {})
cnt: 2, ((T([128, 512, 7, 7], f16, stride=(76800, 1, 10752, 1536)), [128, 192, 192], 1), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([128, 1000], f16), [0], True), {})
cnt: 4, ((T([6400, 512], f16), [0], True), {})
cnt: 2, ((T([6400, 2048], f16), [0], True), {})
cnt: 2, ((T([6400, 1536], f16), [0], True), {})
cnt: 1, ((T([128, 1, 512], f16, stride=(25600, 512, 1)), [0], True), {})
cnt: 4, ((T([25216, 320], f16), [0], True), {})
cnt: 2, ((T([25216, 1280], f16), [0], True), {})
cnt: 2, ((T([25216, 960], f16), [0], True), {})
cnt: 1, ((T([128, 1, 320], f16, stride=(63040, 320, 1)), [0], True), {})
cnt: 4, ((T([100480, 128], f16), [0], True), {})
cnt: 2, ((T([100480, 1024], f16), [0], True), {})
cnt: 2, ((T([100480, 384], f16), [0], True), {})
cnt: 1, ((T([128, 1, 128], f16, stride=(100480, 128, 1)), [0], True), {})
cnt: 4, ((T([401536, 64], f16), [0], True), {})
cnt: 2, ((T([401536, 512], f16), [0], True), {})
cnt: 2, ((T([401536, 192], f16), [0], True), {})
cnt: 1, ((T([128, 1, 64], f16, stride=(200768, 64, 1)), [0], True), {})

View File

@ -0,0 +1,45 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([32, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([32, 1000], f16), T([32, 1000], f16), 1, f16), {})
Operator: aten.add.Tensor
cnt: 64, ((T([32, 768, 32, 32], f16), T([32, 768, 32, 32], f16)), {})
Operator: aten.add_.Tensor
cnt: 65, ((T([], i64), 1), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([32, 768], f16), T([768, 1000], f16, stride=(1, 768))), {})
Operator: aten.clone.default
cnt: 1, ((T([32, 3, 224, 224], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([32, 3, 224, 224], f16), T([768, 3, 7, 7], f16), T([768], f16), [7, 7], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 32, ((T([32, 768, 32, 32], f16), T([768, 1, 7, 7], f16), T([768], f16), [1, 1], [3, 3], [1, 1], False, [0, 0], 768), {})
cnt: 32, ((T([32, 768, 32, 32], f16), T([768, 768, 1, 1], f16), T([768], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 32, ((T([32, 768, 32, 32], f16), T([32, 768, 32, 32], f16), T([768, 768, 1, 1], f16), [768], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 32, ((T([32, 768, 32, 32], f16), T([32, 768, 32, 32], f16), T([768, 1, 7, 7], f16), [768], [1, 1], [3, 3], [1, 1], False, [0, 0], 768, [True, True, True]), {})
cnt: 1, ((T([32, 768, 32, 32], f16), T([32, 3, 224, 224], f16), T([768, 3, 7, 7], f16), [768], [7, 7], [0, 0], [1, 1], False, [0, 0], 1, [False, True, True]), {})
Operator: aten.copy_.default
cnt: 1, ((T([32, 3, 224, 224], f16), T([32, 3, 224, 224], f16)), {})
Operator: aten.div.Scalar
cnt: 1, ((T([32, 768, 32, 32], f16, stride=(768, 1, 0, 0)), 1024), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([32], i64),), {})
Operator: aten.mean.dim
cnt: 1, ((T([32, 768, 32, 32], f16), [-1, -2], True), {})
Operator: aten.mm.default
cnt: 1, ((T([32, 1000], f16), T([1000, 768], f16)), {})
cnt: 1, ((T([1000, 32], f16, stride=(1, 1000)), T([32, 768], f16)), {})
Operator: aten.native_batch_norm.default
cnt: 65, ((T([32, 768, 32, 32], f16), T([768], f16), T([768], f16), T([768], f16), T([768], f16), True, 0.1, 1e-05), {})
Operator: aten.native_batch_norm_backward.default
cnt: 65, ((T([32, 768, 32, 32], f16), T([32, 768, 32, 32], f16), T([768], f16), T([768], f16), T([768], f16), T([768], f32), T([768], f32), True, 1e-05, [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([32, 1000], f16), T([32], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([32, 1000], f16), T([32], i64), None, 1, -100), {})
Operator: aten.relu.default
cnt: 65, ((T([32, 768, 32, 32], f16),), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([32, 1000], f16), [0], True), {})
Operator: aten.threshold_backward.default
cnt: 65, ((T([32, 768, 32, 32], f16), T([32, 768, 32, 32], f16), 0), {})

View File

@ -0,0 +1,210 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([32, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([32, 1000], f16), T([32, 1000], f16), 1, f16), {})
Operator: aten._unsafe_view.default
cnt: 3, ((T([100352, 512], f16), [32, 56, 56, 512]), {})
cnt: 3, ((T([100352, 128], f16), [32, 56, 56, 128]), {})
cnt: 3, ((T([25088, 1024], f16), [32, 28, 28, 1024]), {})
cnt: 3, ((T([25088, 256], f16), [32, 28, 28, 256]), {})
cnt: 27, ((T([6272, 2048], f16), [32, 14, 14, 2048]), {})
cnt: 27, ((T([6272, 512], f16), [32, 14, 14, 512]), {})
cnt: 3, ((T([1568, 4096], f16), [32, 7, 7, 4096]), {})
cnt: 3, ((T([1568, 1024], f16), [32, 7, 7, 1024]), {})
cnt: 3, ((T([32, 7, 7, 1024], f16), [1568, 1024]), {})
Operator: aten.add.Tensor
cnt: 3, ((T([32, 56, 56, 512], f16), T([512], f16)), {})
cnt: 3, ((T([32, 56, 56, 128], f16), T([128], f16)), {})
cnt: 7, ((T([32, 128, 56, 56], f16, stride=(401408, 1, 7168, 128)), T([32, 128, 56, 56], f16, stride=(401408, 1, 7168, 128))), {})
cnt: 1, ((T([32, 1, 56, 56], f16), 1e-06), {})
cnt: 1, ((T([32, 128, 56, 56], f16, stride=(401408, 1, 7168, 128)), T([128, 1, 1], f16)), {})
cnt: 3, ((T([32, 28, 28, 1024], f16), T([1024], f16)), {})
cnt: 3, ((T([32, 28, 28, 256], f16), T([256], f16)), {})
cnt: 7, ((T([32, 256, 28, 28], f16, stride=(200704, 1, 7168, 256)), T([32, 256, 28, 28], f16, stride=(200704, 1, 7168, 256))), {})
cnt: 1, ((T([32, 1, 28, 28], f16), 1e-06), {})
cnt: 1, ((T([32, 256, 28, 28], f16, stride=(200704, 1, 7168, 256)), T([256, 1, 1], f16)), {})
cnt: 27, ((T([32, 14, 14, 2048], f16), T([2048], f16)), {})
cnt: 27, ((T([32, 14, 14, 512], f16), T([512], f16)), {})
cnt: 55, ((T([32, 512, 14, 14], f16, stride=(100352, 1, 7168, 512)), T([32, 512, 14, 14], f16, stride=(100352, 1, 7168, 512))), {})
cnt: 1, ((T([32, 1, 14, 14], f16), 1e-06), {})
cnt: 1, ((T([32, 512, 14, 14], f16, stride=(100352, 1, 7168, 512)), T([512, 1, 1], f16)), {})
cnt: 3, ((T([32, 7, 7, 4096], f16), T([4096], f16)), {})
cnt: 3, ((T([32, 7, 7, 1024], f16), T([1024], f16)), {})
cnt: 3, ((T([32, 1024, 7, 7], f16, stride=(50176, 1, 7168, 1024)), T([32, 1024, 7, 7], f16, stride=(50176, 1, 7168, 1024))), {})
cnt: 3, ((T([32, 1024, 7, 7], f16), T([32, 1024, 7, 7], f16, stride=(50176, 1, 7168, 1024))), {})
cnt: 1, ((T([32, 512, 14, 14], f16, stride=(100352, 1, 7168, 512)), T([32, 512, 14, 14], f16)), {})
cnt: 1, ((T([32, 256, 28, 28], f16, stride=(200704, 1, 7168, 256)), T([32, 256, 28, 28], f16)), {})
cnt: 1, ((T([32, 128, 56, 56], f16, stride=(401408, 1, 7168, 128)), T([32, 128, 56, 56], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([32, 1024], f16), T([1024, 1000], f16, stride=(1, 1024))), {})
Operator: aten.as_strided_.default
cnt: 1, ((T([32, 1024, 1, 1], f16), [32, 1024, 1, 1], [1024, 1, 1024, 1024]), {})
Operator: aten.clone.default
cnt: 1, ((T([32, 3, 224, 224], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([32, 3, 224, 224], f16), T([128, 3, 4, 4], f16), T([128], f16), [4, 4], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([32, 128, 56, 56], f16, stride=(401408, 1, 7168, 128)), T([128, 1, 7, 7], f16), T([128], f16), [1, 1], [3, 3], [1, 1], False, [0, 0], 128), {})
cnt: 1, ((T([32, 128, 56, 56], f16, stride=(401408, 1, 7168, 128)), T([256, 128, 2, 2], f16), T([256], f16), [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([32, 256, 28, 28], f16, stride=(200704, 1, 7168, 256)), T([256, 1, 7, 7], f16), T([256], f16), [1, 1], [3, 3], [1, 1], False, [0, 0], 256), {})
cnt: 1, ((T([32, 256, 28, 28], f16, stride=(200704, 1, 7168, 256)), T([512, 256, 2, 2], f16), T([512], f16), [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 27, ((T([32, 512, 14, 14], f16, stride=(100352, 1, 7168, 512)), T([512, 1, 7, 7], f16), T([512], f16), [1, 1], [3, 3], [1, 1], False, [0, 0], 512), {})
cnt: 1, ((T([32, 512, 14, 14], f16, stride=(100352, 1, 7168, 512)), T([1024, 512, 2, 2], f16), T([1024], f16), [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([32, 1024, 7, 7], f16, stride=(50176, 1, 7168, 1024)), T([1024, 1, 7, 7], f16), T([1024], f16), [1, 1], [3, 3], [1, 1], False, [0, 0], 1024), {})
Operator: aten.convolution_backward.default
cnt: 3, ((T([32, 1024, 7, 7], f16, stride=(50176, 1, 7168, 1024)), T([32, 1024, 7, 7], f16, stride=(50176, 1, 7168, 1024)), T([1024, 1, 7, 7], f16), [1024], [1, 1], [3, 3], [1, 1], False, [0, 0], 1024, [True, True, True]), {})
cnt: 1, ((T([32, 1024, 7, 7], f16), T([32, 512, 14, 14], f16, stride=(100352, 1, 7168, 512)), T([1024, 512, 2, 2], f16), [1024], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 27, ((T([32, 512, 14, 14], f16, stride=(100352, 1, 7168, 512)), T([32, 512, 14, 14], f16, stride=(100352, 1, 7168, 512)), T([512, 1, 7, 7], f16), [512], [1, 1], [3, 3], [1, 1], False, [0, 0], 512, [True, True, True]), {})
cnt: 1, ((T([32, 512, 14, 14], f16, stride=(100352, 1, 7168, 512)), T([32, 256, 28, 28], f16, stride=(200704, 1, 7168, 256)), T([512, 256, 2, 2], f16), [512], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 3, ((T([32, 256, 28, 28], f16, stride=(200704, 1, 7168, 256)), T([32, 256, 28, 28], f16, stride=(200704, 1, 7168, 256)), T([256, 1, 7, 7], f16), [256], [1, 1], [3, 3], [1, 1], False, [0, 0], 256, [True, True, True]), {})
cnt: 1, ((T([32, 256, 28, 28], f16, stride=(200704, 1, 7168, 256)), T([32, 128, 56, 56], f16, stride=(401408, 1, 7168, 128)), T([256, 128, 2, 2], f16), [256], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 3, ((T([32, 128, 56, 56], f16, stride=(401408, 1, 7168, 128)), T([32, 128, 56, 56], f16, stride=(401408, 1, 7168, 128)), T([128, 1, 7, 7], f16), [128], [1, 1], [3, 3], [1, 1], False, [0, 0], 128, [True, True, True]), {})
cnt: 1, ((T([32, 128, 56, 56], f16, stride=(401408, 1, 7168, 128)), T([32, 3, 224, 224], f16), T([128, 3, 4, 4], f16), [128], [4, 4], [0, 0], [1, 1], False, [0, 0], 1, [False, True, True]), {})
Operator: aten.copy_.default
cnt: 1, ((T([32, 3, 224, 224], f16), T([32, 3, 224, 224], f16)), {})
cnt: 1, ((T([32, 1024], f16), T([32, 1024], f16)), {})
cnt: 1, ((T([1024, 512, 2, 2], f16), T([1024, 512, 2, 2], f16, stride=(2048, 1, 1024, 512))), {})
cnt: 1, ((T([512, 256, 2, 2], f16), T([512, 256, 2, 2], f16, stride=(1024, 1, 512, 256))), {})
cnt: 1, ((T([256, 128, 2, 2], f16), T([256, 128, 2, 2], f16, stride=(512, 1, 256, 128))), {})
Operator: aten.div.Scalar
cnt: 1, ((T([32, 1024, 7, 7], f16, stride=(1024, 1, 0, 0)), 49), {})
cnt: 1, ((T([32, 512, 14, 14], f16, stride=(196, 0, 14, 1)), 512), {})
cnt: 1, ((T([32, 256, 28, 28], f16, stride=(784, 0, 28, 1)), 256), {})
cnt: 1, ((T([32, 128, 56, 56], f16, stride=(3136, 0, 56, 1)), 128), {})
Operator: aten.gelu.default
cnt: 3, ((T([32, 56, 56, 512], f16),), {})
cnt: 3, ((T([32, 28, 28, 1024], f16),), {})
cnt: 27, ((T([32, 14, 14, 2048], f16),), {})
cnt: 3, ((T([32, 7, 7, 4096], f16),), {})
Operator: aten.gelu_backward.default
cnt: 3, ((T([32, 7, 7, 4096], f16), T([32, 7, 7, 4096], f16)), {})
cnt: 27, ((T([32, 14, 14, 2048], f16), T([32, 14, 14, 2048], f16)), {})
cnt: 3, ((T([32, 28, 28, 1024], f16), T([32, 28, 28, 1024], f16)), {})
cnt: 3, ((T([32, 56, 56, 512], f16), T([32, 56, 56, 512], f16)), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([32], i64),), {})
Operator: aten.mean.dim
cnt: 1, ((T([32, 1024, 7, 7], f16, stride=(50176, 1, 7168, 1024)), [-1, -2], True), {})
cnt: 1, ((T([32, 512, 14, 14], f16, stride=(100352, 1, 7168, 512)), [1], True), {})
cnt: 1, ((T([32, 256, 28, 28], f16, stride=(200704, 1, 7168, 256)), [1], True), {})
cnt: 1, ((T([32, 128, 56, 56], f16, stride=(401408, 1, 7168, 128)), [1], True), {})
Operator: aten.mm.default
cnt: 3, ((T([100352, 128], f16), T([128, 512], f16, stride=(1, 128))), {})
cnt: 3, ((T([100352, 512], f16), T([512, 128], f16, stride=(1, 512))), {})
cnt: 3, ((T([25088, 256], f16), T([256, 1024], f16, stride=(1, 256))), {})
cnt: 3, ((T([25088, 1024], f16), T([1024, 256], f16, stride=(1, 1024))), {})
cnt: 27, ((T([6272, 512], f16), T([512, 2048], f16, stride=(1, 512))), {})
cnt: 27, ((T([6272, 2048], f16), T([2048, 512], f16, stride=(1, 2048))), {})
cnt: 3, ((T([1568, 1024], f16), T([1024, 4096], f16, stride=(1, 1024))), {})
cnt: 3, ((T([1568, 4096], f16), T([4096, 1024], f16, stride=(1, 4096))), {})
cnt: 1, ((T([32, 1000], f16), T([1000, 1024], f16)), {})
cnt: 1, ((T([1000, 32], f16, stride=(1, 1000)), T([32, 1024], f16)), {})
cnt: 3, ((T([1024, 1568], f16, stride=(1, 1024)), T([1568, 4096], f16)), {})
cnt: 3, ((T([1568, 1024], f16), T([1024, 4096], f16)), {})
cnt: 3, ((T([4096, 1568], f16, stride=(1, 4096)), T([1568, 1024], f16)), {})
cnt: 3, ((T([1568, 4096], f16), T([4096, 1024], f16)), {})
cnt: 27, ((T([512, 6272], f16, stride=(1, 512)), T([6272, 2048], f16)), {})
cnt: 27, ((T([6272, 512], f16), T([512, 2048], f16)), {})
cnt: 27, ((T([2048, 6272], f16, stride=(1, 2048)), T([6272, 512], f16)), {})
cnt: 27, ((T([6272, 2048], f16), T([2048, 512], f16)), {})
cnt: 3, ((T([256, 25088], f16, stride=(1, 256)), T([25088, 1024], f16)), {})
cnt: 3, ((T([25088, 256], f16), T([256, 1024], f16)), {})
cnt: 3, ((T([1024, 25088], f16, stride=(1, 1024)), T([25088, 256], f16)), {})
cnt: 3, ((T([25088, 1024], f16), T([1024, 256], f16)), {})
cnt: 3, ((T([128, 100352], f16, stride=(1, 128)), T([100352, 512], f16)), {})
cnt: 3, ((T([100352, 128], f16), T([128, 512], f16)), {})
cnt: 3, ((T([512, 100352], f16, stride=(1, 512)), T([100352, 128], f16)), {})
cnt: 3, ((T([100352, 512], f16), T([512, 128], f16)), {})
Operator: aten.mul.Scalar
cnt: 1, ((T([32, 1, 14, 14], f16), -0.5), {})
cnt: 1, ((T([32, 1, 14, 14], f16), 0.00390625), {})
cnt: 1, ((T([32, 1, 28, 28], f16), -0.5), {})
cnt: 1, ((T([32, 1, 28, 28], f16), 0.0078125), {})
cnt: 1, ((T([32, 1, 56, 56], f16), -0.5), {})
cnt: 1, ((T([32, 1, 56, 56], f16), 0.015625), {})
Operator: aten.mul.Tensor
cnt: 6, ((T([32, 128, 56, 56], f16, stride=(401408, 1, 7168, 128)), T([1, 128, 1, 1], f16)), {})
cnt: 2, ((T([32, 128, 56, 56], f16, stride=(401408, 1, 7168, 128)), T([32, 1, 56, 56], f16)), {})
cnt: 2, ((T([32, 128, 56, 56], f16, stride=(401408, 1, 7168, 128)), T([128, 1, 1], f16)), {})
cnt: 6, ((T([32, 256, 28, 28], f16, stride=(200704, 1, 7168, 256)), T([1, 256, 1, 1], f16)), {})
cnt: 2, ((T([32, 256, 28, 28], f16, stride=(200704, 1, 7168, 256)), T([32, 1, 28, 28], f16)), {})
cnt: 2, ((T([32, 256, 28, 28], f16, stride=(200704, 1, 7168, 256)), T([256, 1, 1], f16)), {})
cnt: 54, ((T([32, 512, 14, 14], f16, stride=(100352, 1, 7168, 512)), T([1, 512, 1, 1], f16)), {})
cnt: 2, ((T([32, 512, 14, 14], f16, stride=(100352, 1, 7168, 512)), T([32, 1, 14, 14], f16)), {})
cnt: 2, ((T([32, 512, 14, 14], f16, stride=(100352, 1, 7168, 512)), T([512, 1, 1], f16)), {})
cnt: 3, ((T([32, 1024, 7, 7], f16, stride=(50176, 1, 7168, 1024)), T([1, 1024, 1, 1], f16)), {})
cnt: 3, ((T([32, 1024, 7, 7], f16), T([32, 1024, 7, 7], f16, stride=(50176, 1, 7168, 1024))), {})
cnt: 3, ((T([32, 1024, 7, 7], f16), T([1, 1024, 1, 1], f16)), {})
cnt: 29, ((T([32, 512, 14, 14], f16, stride=(100352, 1, 7168, 512)), T([32, 512, 14, 14], f16, stride=(100352, 1, 7168, 512))), {})
cnt: 1, ((T([32, 1, 14, 14], f16), T([32, 1, 14, 14], f16)), {})
cnt: 1, ((T([32, 1, 14, 14], f16), T([32, 512, 14, 14], f16, stride=(100352, 1, 7168, 512))), {})
cnt: 5, ((T([32, 256, 28, 28], f16, stride=(200704, 1, 7168, 256)), T([32, 256, 28, 28], f16, stride=(200704, 1, 7168, 256))), {})
cnt: 1, ((T([32, 1, 28, 28], f16), T([32, 1, 28, 28], f16)), {})
cnt: 1, ((T([32, 1, 28, 28], f16), T([32, 256, 28, 28], f16, stride=(200704, 1, 7168, 256))), {})
cnt: 5, ((T([32, 128, 56, 56], f16, stride=(401408, 1, 7168, 128)), T([32, 128, 56, 56], f16, stride=(401408, 1, 7168, 128))), {})
cnt: 1, ((T([32, 1, 56, 56], f16), T([32, 1, 56, 56], f16)), {})
cnt: 1, ((T([32, 1, 56, 56], f16), T([32, 128, 56, 56], f16, stride=(401408, 1, 7168, 128))), {})
Operator: aten.native_layer_norm.default
cnt: 1, ((T([32, 56, 56, 128], f16, stride=(401408, 56, 1, 3136)), [128], T([128], f16), T([128], f16), 1e-06), {})
cnt: 3, ((T([32, 56, 56, 128], f16), [128], T([128], f16), T([128], f16), 1e-06), {})
cnt: 3, ((T([32, 28, 28, 256], f16), [256], T([256], f16), T([256], f16), 1e-06), {})
cnt: 27, ((T([32, 14, 14, 512], f16), [512], T([512], f16), T([512], f16), 1e-06), {})
cnt: 3, ((T([32, 7, 7, 1024], f16), [1024], T([1024], f16), T([1024], f16), 1e-06), {})
cnt: 1, ((T([32, 1, 1, 1024], f16), [1024], T([1024], f16), T([1024], f16), 1e-06), {})
Operator: aten.native_layer_norm_backward.default
cnt: 1, ((T([32, 1, 1, 1024], f16), T([32, 1, 1, 1024], f16), [1024], T([32, 1, 1, 1], f32), T([32, 1, 1, 1], f32), T([1024], f16), T([1024], f16), [True, True, True]), {})
cnt: 3, ((T([32, 7, 7, 1024], f16), T([32, 7, 7, 1024], f16), [1024], T([32, 7, 7, 1], f32), T([32, 7, 7, 1], f32), T([1024], f16), T([1024], f16), [True, True, True]), {})
cnt: 27, ((T([32, 14, 14, 512], f16), T([32, 14, 14, 512], f16), [512], T([32, 14, 14, 1], f32), T([32, 14, 14, 1], f32), T([512], f16), T([512], f16), [True, True, True]), {})
cnt: 3, ((T([32, 28, 28, 256], f16), T([32, 28, 28, 256], f16), [256], T([32, 28, 28, 1], f32), T([32, 28, 28, 1], f32), T([256], f16), T([256], f16), [True, True, True]), {})
cnt: 3, ((T([32, 56, 56, 128], f16), T([32, 56, 56, 128], f16), [128], T([32, 56, 56, 1], f32), T([32, 56, 56, 1], f32), T([128], f16), T([128], f16), [True, True, True]), {})
cnt: 1, ((T([32, 56, 56, 128], f16), T([32, 56, 56, 128], f16, stride=(401408, 56, 1, 3136)), [128], T([32, 56, 56, 1], f32), T([32, 56, 56, 1], f32), T([128], f16), T([128], f16), [True, True, True]), {})
Operator: aten.neg.default
cnt: 1, ((T([32, 512, 14, 14], f16, stride=(100352, 1, 7168, 512)),), {})
cnt: 1, ((T([32, 256, 28, 28], f16, stride=(200704, 1, 7168, 256)),), {})
cnt: 1, ((T([32, 128, 56, 56], f16, stride=(401408, 1, 7168, 128)),), {})
Operator: aten.new_empty_strided.default
cnt: 1, ((T([1024, 512, 2, 2], f16, stride=(2048, 1, 1024, 512)), [1024, 512, 2, 2], [2048, 4, 2, 1]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
cnt: 1, ((T([512, 256, 2, 2], f16, stride=(1024, 1, 512, 256)), [512, 256, 2, 2], [1024, 4, 2, 1]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
cnt: 1, ((T([256, 128, 2, 2], f16, stride=(512, 1, 256, 128)), [256, 128, 2, 2], [512, 4, 2, 1]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten.new_zeros.default
cnt: 1, ((T([32, 1024], f16), [32768]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([32, 1000], f16), T([32], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([32, 1000], f16), T([32], i64), None, 1, -100), {})
Operator: aten.pow.Tensor_Scalar
cnt: 1, ((T([32, 1, 14, 14], f16), 3), {})
cnt: 1, ((T([32, 1, 28, 28], f16), 3), {})
cnt: 1, ((T([32, 1, 56, 56], f16), 3), {})
Operator: aten.rsqrt.default
cnt: 1, ((T([32, 1, 56, 56], f16),), {})
cnt: 1, ((T([32, 1, 28, 28], f16),), {})
cnt: 1, ((T([32, 1, 14, 14], f16),), {})
Operator: aten.slice_backward.default
cnt: 2, ((T([512], f16), [512], 0, 0, 9223372036854775807, 1), {})
cnt: 2, ((T([256], f16), [256], 0, 0, 9223372036854775807, 1), {})
cnt: 2, ((T([128], f16), [128], 0, 0, 9223372036854775807, 1), {})
Operator: aten.sub.Tensor
cnt: 2, ((T([32, 128, 56, 56], f16, stride=(401408, 1, 7168, 128)), T([32, 1, 56, 56], f16)), {})
cnt: 2, ((T([32, 256, 28, 28], f16, stride=(200704, 1, 7168, 256)), T([32, 1, 28, 28], f16)), {})
cnt: 2, ((T([32, 512, 14, 14], f16, stride=(100352, 1, 7168, 512)), T([32, 1, 14, 14], f16)), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([32, 1000], f16), [0], True), {})
cnt: 3, ((T([32, 1024, 7, 7], f16), [0, 2, 3], True), {})
cnt: 3, ((T([32, 7, 7, 1024], f16, stride=(50176, 7, 1, 49)), [0, 1, 2], True), {})
cnt: 3, ((T([32, 7, 7, 4096], f16), [0, 1, 2], True), {})
cnt: 29, ((T([32, 512, 14, 14], f16, stride=(100352, 1, 7168, 512)), [0, 2, 3], True), {})
cnt: 2, ((T([32, 512, 14, 14], f16, stride=(100352, 1, 7168, 512)), [1], True), {})
cnt: 27, ((T([32, 14, 14, 512], f16), [0, 1, 2], True), {})
cnt: 27, ((T([32, 14, 14, 2048], f16), [0, 1, 2], True), {})
cnt: 5, ((T([32, 256, 28, 28], f16, stride=(200704, 1, 7168, 256)), [0, 2, 3], True), {})
cnt: 2, ((T([32, 256, 28, 28], f16, stride=(200704, 1, 7168, 256)), [1], True), {})
cnt: 3, ((T([32, 28, 28, 256], f16), [0, 1, 2], True), {})
cnt: 3, ((T([32, 28, 28, 1024], f16), [0, 1, 2], True), {})
cnt: 5, ((T([32, 128, 56, 56], f16, stride=(401408, 1, 7168, 128)), [0, 2, 3], True), {})
cnt: 2, ((T([32, 128, 56, 56], f16, stride=(401408, 1, 7168, 128)), [1], True), {})
cnt: 3, ((T([32, 56, 56, 128], f16), [0, 1, 2], True), {})
cnt: 3, ((T([32, 56, 56, 512], f16), [0, 1, 2], True), {})
Operator: aten.var_mean.correction
cnt: 1, ((T([32, 128, 56, 56], f16, stride=(401408, 1, 7168, 128)), [1]), {'correction': 0, 'keepdim': True})
cnt: 1, ((T([32, 256, 28, 28], f16, stride=(200704, 1, 7168, 256)), [1]), {'correction': 0, 'keepdim': True})
cnt: 1, ((T([32, 512, 14, 14], f16, stride=(100352, 1, 7168, 512)), [1]), {'correction': 0, 'keepdim': True})

View File

@ -0,0 +1,203 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([64, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([64, 1000], f16), T([64, 1000], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 3, ((T([64, 4, 401, 401], f16), -1, False), {})
cnt: 9, ((T([64, 4, 197, 197], f16), -1, False), {})
cnt: 3, ((T([64, 4, 1, 197], f16), -1, False), {})
cnt: 3, ((T([64, 4, 1, 401], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 3, ((T([64, 4, 1, 401], f16), T([64, 4, 1, 401], f16), -1, f16), {})
cnt: 3, ((T([64, 4, 1, 197], f16), T([64, 4, 1, 197], f16), -1, f16), {})
cnt: 9, ((T([64, 4, 197, 197], f16), T([64, 4, 197, 197], f16), -1, f16), {})
cnt: 3, ((T([64, 4, 401, 401], f16), T([64, 4, 401, 401], f16), -1, f16), {})
Operator: aten._unsafe_view.default
cnt: 12, ((T([64, 4, 401, 32], f16), [256, 401, 32]), {})
cnt: 6, ((T([64, 4, 32, 401], f16), [256, 32, 401]), {})
cnt: 3, ((T([256, 401, 401], f16), [64, 4, 401, 401]), {})
cnt: 3, ((T([256, 401, 32], f16), [64, 4, 401, 32]), {})
cnt: 6, ((T([64, 401, 4, 32], f16), [64, 401, 128]), {})
cnt: 30, ((T([64, 4, 197, 64], f16), [256, 197, 64]), {})
cnt: 12, ((T([64, 4, 64, 197], f16), [256, 64, 197]), {})
cnt: 9, ((T([256, 197, 197], f16), [64, 4, 197, 197]), {})
cnt: 9, ((T([256, 197, 64], f16), [64, 4, 197, 64]), {})
cnt: 12, ((T([64, 197, 4, 64], f16), [64, 197, 256]), {})
cnt: 3, ((T([64, 256], f16), [64, 1, 256]), {})
cnt: 3, ((T([256, 1, 197], f16), [64, 4, 1, 197]), {})
cnt: 3, ((T([256, 1, 64], f16), [64, 4, 1, 64]), {})
cnt: 3, ((T([64, 128], f16), [64, 1, 128]), {})
cnt: 3, ((T([256, 1, 401], f16), [64, 4, 1, 401]), {})
cnt: 3, ((T([256, 1, 32], f16), [64, 4, 1, 32]), {})
cnt: 3, ((T([64, 401, 128], f16), [25664, 128]), {})
cnt: 3, ((T([64, 197, 256], f16), [12608, 256]), {})
cnt: 9, ((T([64, 197, 3, 4, 64], f16), [64, 197, 768]), {})
cnt: 3, ((T([64, 401, 3, 4, 32], f16), [64, 401, 384]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([64, 401, 128], f16), T([1, 401, 128], f16)), {})
cnt: 1, ((T([64, 197, 256], f16), T([1, 197, 256], f16)), {})
cnt: 27, ((T([64, 401, 128], f16), T([64, 401, 128], f16)), {})
cnt: 51, ((T([64, 197, 256], f16), T([64, 197, 256], f16)), {})
cnt: 3, ((T([64, 1, 256], f16), T([256], f16)), {})
cnt: 3, ((T([64, 1, 256], f16, stride=(50432, 256, 1)), T([64, 1, 256], f16)), {})
cnt: 3, ((T([64, 1, 128], f16), T([128], f16)), {})
cnt: 3, ((T([64, 1, 128], f16, stride=(51328, 128, 1)), T([64, 1, 128], f16)), {})
Operator: aten.addmm.default
cnt: 6, ((T([384], f16), T([25664, 128], f16), T([128, 384], f16, stride=(1, 128))), {})
cnt: 9, ((T([128], f16), T([25664, 128], f16), T([128, 128], f16, stride=(1, 128))), {})
cnt: 3, ((T([128], f16), T([25664, 384], f16), T([384, 128], f16, stride=(1, 384))), {})
cnt: 18, ((T([768], f16), T([12608, 256], f16), T([256, 768], f16, stride=(1, 256))), {})
cnt: 15, ((T([256], f16), T([12608, 256], f16), T([256, 256], f16, stride=(1, 256))), {})
cnt: 9, ((T([256], f16), T([12608, 768], f16), T([768, 256], f16, stride=(1, 768))), {})
cnt: 6, ((T([256], f16), T([64, 128], f16), T([128, 256], f16, stride=(1, 128))), {})
cnt: 6, ((T([128], f16), T([64, 256], f16), T([256, 128], f16, stride=(1, 256))), {})
cnt: 3, ((T([256], f16), T([64, 256], f16), T([256, 256], f16, stride=(1, 256))), {})
cnt: 3, ((T([128], f16), T([64, 128], f16), T([128, 128], f16, stride=(1, 128))), {})
cnt: 1, ((T([1000], f16), T([64, 128], f16, stride=(51328, 1)), T([128, 1000], f16, stride=(1, 128))), {})
cnt: 1, ((T([1000], f16), T([64, 256], f16, stride=(50432, 1)), T([256, 1000], f16, stride=(1, 256))), {})
Operator: aten.bmm.default
cnt: 3, ((T([256, 401, 32], f16), T([256, 32, 401], f16)), {})
cnt: 3, ((T([256, 401, 401], f16), T([256, 401, 32], f16)), {})
cnt: 9, ((T([256, 197, 64], f16), T([256, 64, 197], f16)), {})
cnt: 9, ((T([256, 197, 197], f16), T([256, 197, 64], f16)), {})
cnt: 3, ((T([256, 1, 64], f16), T([256, 64, 197], f16)), {})
cnt: 3, ((T([256, 1, 197], f16), T([256, 197, 64], f16)), {})
cnt: 3, ((T([256, 1, 32], f16), T([256, 32, 401], f16)), {})
cnt: 3, ((T([256, 1, 401], f16), T([256, 401, 32], f16)), {})
cnt: 3, ((T([256, 401, 1], f16), T([256, 1, 32], f16)), {})
cnt: 3, ((T([256, 1, 32], f16), T([256, 32, 401], f16, stride=(12832, 1, 32))), {})
cnt: 3, ((T([256, 32, 1], f16), T([256, 1, 401], f16)), {})
cnt: 3, ((T([256, 1, 401], f16), T([256, 401, 32], f16, stride=(12832, 1, 401))), {})
cnt: 3, ((T([256, 197, 1], f16), T([256, 1, 64], f16)), {})
cnt: 3, ((T([256, 1, 64], f16), T([256, 64, 197], f16, stride=(12608, 1, 64))), {})
cnt: 3, ((T([256, 64, 1], f16), T([256, 1, 197], f16)), {})
cnt: 3, ((T([256, 1, 197], f16), T([256, 197, 64], f16, stride=(12608, 1, 197))), {})
cnt: 9, ((T([256, 197, 197], f16, stride=(38809, 1, 197)), T([256, 197, 64], f16)), {})
cnt: 9, ((T([256, 197, 64], f16), T([256, 64, 197], f16, stride=(12608, 1, 64))), {})
cnt: 9, ((T([256, 64, 197], f16, stride=(12608, 1, 64)), T([256, 197, 197], f16)), {})
cnt: 9, ((T([256, 197, 197], f16), T([256, 197, 64], f16, stride=(12608, 1, 197))), {})
cnt: 3, ((T([256, 401, 401], f16, stride=(160801, 1, 401)), T([256, 401, 32], f16)), {})
cnt: 3, ((T([256, 401, 32], f16), T([256, 32, 401], f16, stride=(12832, 1, 32))), {})
cnt: 3, ((T([256, 32, 401], f16, stride=(12832, 1, 32)), T([256, 401, 401], f16)), {})
cnt: 3, ((T([256, 401, 401], f16), T([256, 401, 32], f16, stride=(12832, 1, 401))), {})
Operator: aten.cat.default
cnt: 1, (([T([64, 1, 128], f16, stride=(0, 128, 1)), T([64, 400, 128], f16, stride=(51200, 1, 400))], 1), {})
cnt: 1, (([T([64, 1, 256], f16, stride=(0, 256, 1)), T([64, 196, 256], f16, stride=(50176, 1, 196))], 1), {})
cnt: 6, (([T([64, 1, 256], f16), T([64, 196, 256], f16, stride=(50432, 256, 1))], 1), {})
cnt: 6, (([T([64, 1, 128], f16), T([64, 400, 128], f16, stride=(51328, 128, 1))], 1), {})
Operator: aten.clone.default
cnt: 1, ((T([64, 3, 240, 240], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([64, 3, 240, 240], f16), T([128, 3, 12, 12], f16), T([128], f16), [12, 12], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 3, 224, 224], f16), T([256, 3, 16, 16], f16), T([256], f16), [16, 16], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([64, 256, 14, 14], f16, stride=(50432, 1, 3584, 256)), T([64, 3, 224, 224], f16), T([256, 3, 16, 16], f16), [256], [16, 16], [0, 0], [1, 1], False, [0, 0], 1, [False, True, True]), {})
cnt: 1, ((T([64, 128, 20, 20], f16, stride=(51328, 1, 2560, 128)), T([64, 3, 240, 240], f16), T([128, 3, 12, 12], f16), [128], [12, 12], [0, 0], [1, 1], False, [0, 0], 1, [False, True, True]), {})
Operator: aten.copy_.default
cnt: 1, ((T([64, 3, 240, 240], f16), T([64, 3, 240, 240], f16)), {})
Operator: aten.div.Scalar
cnt: 1, ((T([2, 64, 1000], f16, stride=(0, 1000, 1)), 2), {})
Operator: aten.gelu.default
cnt: 3, ((T([64, 401, 384], f16),), {})
cnt: 9, ((T([64, 197, 768], f16),), {})
cnt: 6, ((T([64, 1, 128], f16),), {})
cnt: 6, ((T([64, 1, 256], f16),), {})
Operator: aten.gelu_backward.default
cnt: 6, ((T([64, 1, 128], f16), T([64, 1, 128], f16)), {})
cnt: 6, ((T([64, 1, 256], f16), T([64, 1, 256], f16)), {})
cnt: 9, ((T([64, 197, 768], f16), T([64, 197, 768], f16)), {})
cnt: 3, ((T([64, 401, 384], f16), T([64, 401, 384], f16)), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([64], i64),), {})
Operator: aten.mean.dim
cnt: 1, ((T([2, 64, 1000], f16), [0]), {})
Operator: aten.mm.default
cnt: 3, ((T([64, 256], f16, stride=(50432, 1)), T([256, 256], f16, stride=(1, 256))), {})
cnt: 3, ((T([64, 128], f16, stride=(51328, 1)), T([128, 128], f16, stride=(1, 128))), {})
cnt: 1, ((T([64, 1000], f16), T([1000, 256], f16)), {})
cnt: 1, ((T([1000, 64], f16, stride=(1, 1000)), T([64, 256], f16, stride=(50432, 1))), {})
cnt: 1, ((T([64, 1000], f16), T([1000, 128], f16)), {})
cnt: 1, ((T([1000, 64], f16, stride=(1, 1000)), T([64, 128], f16, stride=(51328, 1))), {})
cnt: 6, ((T([64, 256], f16, stride=(50432, 1)), T([256, 128], f16)), {})
cnt: 6, ((T([256, 64], f16, stride=(1, 50432)), T([64, 128], f16)), {})
cnt: 6, ((T([64, 128], f16), T([128, 128], f16)), {})
cnt: 3, ((T([128, 64], f16, stride=(1, 128)), T([64, 128], f16)), {})
cnt: 9, ((T([25664, 128], f16), T([128, 128], f16)), {})
cnt: 9, ((T([128, 25664], f16, stride=(1, 128)), T([25664, 128], f16)), {})
cnt: 3, ((T([128, 64], f16, stride=(1, 128)), T([64, 128], f16, stride=(51328, 1))), {})
cnt: 6, ((T([64, 128], f16, stride=(51328, 1)), T([128, 256], f16)), {})
cnt: 6, ((T([128, 64], f16, stride=(1, 51328)), T([64, 256], f16)), {})
cnt: 6, ((T([64, 256], f16), T([256, 256], f16)), {})
cnt: 3, ((T([256, 64], f16, stride=(1, 256)), T([64, 256], f16)), {})
cnt: 15, ((T([12608, 256], f16), T([256, 256], f16)), {})
cnt: 15, ((T([256, 12608], f16, stride=(1, 256)), T([12608, 256], f16)), {})
cnt: 3, ((T([256, 64], f16, stride=(1, 256)), T([64, 256], f16, stride=(50432, 1))), {})
cnt: 9, ((T([12608, 256], f16), T([256, 768], f16)), {})
cnt: 9, ((T([256, 12608], f16, stride=(1, 256)), T([12608, 768], f16)), {})
cnt: 18, ((T([12608, 768], f16), T([768, 256], f16)), {})
cnt: 18, ((T([768, 12608], f16, stride=(1, 768)), T([12608, 256], f16)), {})
cnt: 3, ((T([25664, 128], f16), T([128, 384], f16)), {})
cnt: 3, ((T([128, 25664], f16, stride=(1, 128)), T([25664, 384], f16)), {})
cnt: 6, ((T([25664, 384], f16), T([384, 128], f16)), {})
cnt: 6, ((T([384, 25664], f16, stride=(1, 384)), T([25664, 128], f16)), {})
Operator: aten.mul.Tensor
cnt: 6, ((T([64, 4, 401, 401], f16), 0.1767766952966369), {})
cnt: 18, ((T([64, 4, 197, 197], f16), 0.125), {})
cnt: 6, ((T([64, 4, 1, 197], f16), 0.125), {})
cnt: 6, ((T([64, 4, 1, 401], f16), 0.1767766952966369), {})
Operator: aten.native_layer_norm.default
cnt: 10, ((T([64, 401, 128], f16), [128], T([128], f16), T([128], f16), 1e-06), {})
cnt: 22, ((T([64, 197, 256], f16), [256], T([256], f16), T([256], f16), 1e-06), {})
cnt: 3, ((T([64, 1, 128], f16, stride=(51328, 128, 1)), [128], T([128], f16), T([128], f16), 1e-06), {})
cnt: 3, ((T([64, 1, 256], f16, stride=(50432, 256, 1)), [256], T([256], f16), T([256], f16), 1e-06), {})
cnt: 3, ((T([64, 1, 256], f16), [256], T([256], f16), T([256], f16), 1e-06), {})
cnt: 3, ((T([64, 1, 128], f16), [128], T([128], f16), T([128], f16), 1e-06), {})
Operator: aten.native_layer_norm_backward.default
cnt: 22, ((T([64, 197, 256], f16), T([64, 197, 256], f16), [256], T([64, 197, 1], f32), T([64, 197, 1], f32), T([256], f16), T([256], f16), [True, True, True]), {})
cnt: 10, ((T([64, 401, 128], f16), T([64, 401, 128], f16), [128], T([64, 401, 1], f32), T([64, 401, 1], f32), T([128], f16), T([128], f16), [True, True, True]), {})
cnt: 3, ((T([64, 1, 128], f16), T([64, 1, 128], f16), [128], T([64, 1, 1], f32), T([64, 1, 1], f32), T([128], f16), T([128], f16), [True, True, True]), {})
cnt: 3, ((T([64, 1, 256], f16), T([64, 1, 256], f16), [256], T([64, 1, 1], f32), T([64, 1, 1], f32), T([256], f16), T([256], f16), [True, True, True]), {})
cnt: 3, ((T([64, 1, 256], f16), T([64, 1, 256], f16, stride=(50432, 256, 1)), [256], T([64, 1, 1], f32), T([64, 1, 1], f32), T([256], f16), T([256], f16), [True, True, True]), {})
cnt: 3, ((T([64, 1, 128], f16), T([64, 1, 128], f16, stride=(51328, 128, 1)), [128], T([64, 1, 1], f32), T([64, 1, 1], f32), T([128], f16), T([128], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([64, 1000], f16), T([64], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([64, 1000], f16), T([64], i64), None, 1, -100), {})
Operator: aten.select_backward.default
cnt: 1, ((T([64, 256], f16), [64, 197, 256], 1, 0), {})
cnt: 1, ((T([64, 128], f16), [64, 401, 128], 1, 0), {})
Operator: aten.slice_backward.default
cnt: 16, ((T([64, 197, 256], f16), [64, 197, 256], 0, 0, 9223372036854775807, 1), {})
cnt: 16, ((T([64, 401, 128], f16), [64, 401, 128], 0, 0, 9223372036854775807, 1), {})
cnt: 6, ((T([64, 196, 256], f16, stride=(50432, 256, 1)), [64, 197, 256], 1, 1, 9223372036854775807, 1), {})
cnt: 3, ((T([64, 1, 128], f16), [64, 1, 128], 0, 0, 9223372036854775807, 1), {})
cnt: 9, ((T([64, 1, 128], f16), [64, 401, 128], 1, 0, 1, 1), {})
cnt: 6, ((T([64, 400, 128], f16, stride=(51328, 128, 1)), [64, 401, 128], 1, 1, 9223372036854775807, 1), {})
cnt: 3, ((T([64, 1, 256], f16), [64, 1, 256], 0, 0, 9223372036854775807, 1), {})
cnt: 9, ((T([64, 1, 256], f16), [64, 197, 256], 1, 0, 1, 1), {})
Operator: aten.stack.default
cnt: 1, (([T([64, 1000], f16), T([64, 1000], f16)],), {})
cnt: 9, (([T([64, 4, 197, 64], f16), T([64, 4, 197, 64], f16, stride=(50432, 12608, 1, 197)), T([64, 4, 197, 64], f16)],), {})
cnt: 3, (([T([64, 4, 401, 32], f16), T([64, 4, 401, 32], f16, stride=(51328, 12832, 1, 401)), T([64, 4, 401, 32], f16)],), {})
Operator: aten.sum.SymInt
cnt: 2, ((T([64, 1000], f16), [0], True), {})
cnt: 6, ((T([64, 256], f16, stride=(50432, 1)), [0], True), {})
cnt: 3, ((T([64, 128], f16), [0], True), {})
cnt: 12, ((T([25664, 128], f16), [0], True), {})
cnt: 3, ((T([64, 1, 128], f16), [0, 1], True), {})
cnt: 6, ((T([64, 128], f16, stride=(51328, 1)), [0], True), {})
cnt: 3, ((T([64, 256], f16), [0], True), {})
cnt: 24, ((T([12608, 256], f16), [0], True), {})
cnt: 3, ((T([64, 1, 256], f16), [0, 1], True), {})
cnt: 18, ((T([12608, 768], f16), [0], True), {})
cnt: 6, ((T([25664, 384], f16), [0], True), {})
cnt: 1, ((T([64, 197, 256], f16), [0], True), {})
cnt: 1, ((T([64, 1, 256], f16, stride=(50432, 256, 1)), [0], True), {})
cnt: 1, ((T([64, 401, 128], f16), [0], True), {})
cnt: 1, ((T([64, 1, 128], f16, stride=(51328, 128, 1)), [0], True), {})
Operator: aten.unbind.int
cnt: 3, ((T([3, 64, 4, 401, 32], f16, stride=(128, 153984, 32, 384, 1)),), {})
cnt: 9, ((T([3, 64, 4, 197, 64], f16, stride=(256, 151296, 64, 768, 1)),), {})
cnt: 1, ((T([2, 64, 1000], f16),), {})
Operator: aten.upsample_bicubic2d.vec
cnt: 1, ((T([64, 3, 240, 240], f16), [224, 224], False, None), {})

View File

@ -0,0 +1,177 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([64, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([64, 1000], f16), T([64, 1000], f16), 1, f16), {})
Operator: aten.add.Tensor
cnt: 67, ((T([], i64), 1), {})
cnt: 1, ((T([64, 64, 128, 128], f16), T([64, 64, 128, 128], f16, stride=(2097152, 16384, 128, 1))), {})
cnt: 1, ((T([64, 64, 64, 64], f16), T([64, 64, 64, 64], f16, stride=(524288, 4096, 64, 1))), {})
cnt: 3, ((T([64, 64, 64, 64], f16), T([64, 64, 64, 64], f16)), {})
cnt: 1, ((T([64, 128, 32, 32], f16), T([64, 128, 32, 32], f16, stride=(262144, 1024, 32, 1))), {})
cnt: 15, ((T([64, 128, 32, 32], f16), T([64, 128, 32, 32], f16)), {})
cnt: 1, ((T([64, 256, 16, 16], f16), T([64, 256, 16, 16], f16, stride=(131072, 256, 16, 1))), {})
cnt: 15, ((T([64, 256, 16, 16], f16), T([64, 256, 16, 16], f16)), {})
cnt: 1, ((T([64, 512, 8, 8], f16), T([64, 512, 8, 8], f16, stride=(65536, 64, 8, 1))), {})
cnt: 7, ((T([64, 512, 8, 8], f16), T([64, 512, 8, 8], f16)), {})
cnt: 1, ((T([64, 1024, 8, 8], f16), T([64, 1024, 8, 8], f16)), {})
cnt: 1, ((T([64, 512, 16, 16], f16), T([64, 512, 16, 16], f16)), {})
cnt: 1, ((T([64, 256, 32, 32], f16), T([64, 256, 32, 32], f16)), {})
cnt: 1, ((T([64, 128, 64, 64], f16), T([64, 128, 64, 64], f16)), {})
cnt: 1, ((T([64, 64, 128, 128], f16), T([64, 64, 128, 128], f16)), {})
cnt: 1, ((T([64, 128, 128, 128], f16), T([64, 128, 128, 128], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([64, 1024], f16), T([1024, 1000], f16, stride=(1, 1024))), {})
Operator: aten.cat.default
cnt: 1, (([T([64, 64, 128, 128], f16, stride=(2097152, 16384, 128, 1)), T([64, 64, 128, 128], f16)], 1), {})
cnt: 1, (([T([64, 64, 64, 64], f16, stride=(524288, 4096, 64, 1)), T([64, 64, 64, 64], f16)], 1), {})
cnt: 1, (([T([64, 128, 32, 32], f16, stride=(262144, 1024, 32, 1)), T([64, 128, 32, 32], f16)], 1), {})
cnt: 1, (([T([64, 256, 16, 16], f16, stride=(131072, 256, 16, 1)), T([64, 256, 16, 16], f16)], 1), {})
cnt: 1, (([T([64, 512, 8, 8], f16, stride=(65536, 64, 8, 1)), T([64, 512, 8, 8], f16)], 1), {})
Operator: aten.clone.default
cnt: 1, ((T([64, 3, 256, 256], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([64, 3, 256, 256], f16), T([32, 3, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 32, 256, 256], f16), T([64, 32, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 64, 128, 128], f16), T([128, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 64, 128, 128], f16, stride=(2097152, 16384, 128, 1)), T([32, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 32, 128, 128], f16), T([64, 32, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 64, 128, 128], f16), T([64, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 128, 128, 128], f16), T([64, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 64, 128, 128], f16), T([128, 64, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([64, 128, 64, 64], f16), T([128, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 64, 64, 64], f16, stride=(524288, 4096, 64, 1)), T([64, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([64, 64, 64, 64], f16), T([64, 64, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([64, 64, 64, 64], f16), T([64, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 128, 64, 64], f16), T([256, 128, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([64, 256, 32, 32], f16), T([256, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 128, 32, 32], f16, stride=(262144, 1024, 32, 1)), T([128, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 8, ((T([64, 128, 32, 32], f16), T([128, 128, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 8, ((T([64, 128, 32, 32], f16), T([128, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 256, 32, 32], f16), T([512, 256, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([64, 512, 16, 16], f16), T([512, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 256, 16, 16], f16, stride=(131072, 256, 16, 1)), T([256, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 8, ((T([64, 256, 16, 16], f16), T([256, 256, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 8, ((T([64, 256, 16, 16], f16), T([256, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 512, 16, 16], f16), T([1024, 512, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([64, 1024, 8, 8], f16), T([1024, 1024, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 512, 8, 8], f16, stride=(65536, 64, 8, 1)), T([512, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([64, 512, 8, 8], f16), T([512, 512, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([64, 512, 8, 8], f16), T([512, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 2, ((T([64, 1024, 8, 8], f16), T([64, 1024, 8, 8], f16), T([1024, 1024, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([64, 512, 8, 8], f16), T([64, 512, 8, 8], f16), T([512, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([64, 512, 8, 8], f16), T([64, 512, 8, 8], f16), T([512, 512, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 512, 8, 8], f16), T([64, 512, 8, 8], f16, stride=(65536, 64, 8, 1)), T([512, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 1024, 8, 8], f16), T([64, 512, 16, 16], f16), T([1024, 512, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([64, 512, 16, 16], f16), T([64, 512, 16, 16], f16), T([512, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 8, ((T([64, 256, 16, 16], f16), T([64, 256, 16, 16], f16), T([256, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 8, ((T([64, 256, 16, 16], f16), T([64, 256, 16, 16], f16), T([256, 256, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 256, 16, 16], f16), T([64, 256, 16, 16], f16, stride=(131072, 256, 16, 1)), T([256, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 512, 16, 16], f16), T([64, 256, 32, 32], f16), T([512, 256, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([64, 256, 32, 32], f16), T([64, 256, 32, 32], f16), T([256, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 8, ((T([64, 128, 32, 32], f16), T([64, 128, 32, 32], f16), T([128, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 8, ((T([64, 128, 32, 32], f16), T([64, 128, 32, 32], f16), T([128, 128, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 32, 32], f16), T([64, 128, 32, 32], f16, stride=(262144, 1024, 32, 1)), T([128, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 256, 32, 32], f16), T([64, 128, 64, 64], f16), T([256, 128, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([64, 128, 64, 64], f16), T([64, 128, 64, 64], f16), T([128, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([64, 64, 64, 64], f16), T([64, 64, 64, 64], f16), T([64, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([64, 64, 64, 64], f16), T([64, 64, 64, 64], f16), T([64, 64, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 64, 64, 64], f16), T([64, 64, 64, 64], f16, stride=(524288, 4096, 64, 1)), T([64, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 64, 64], f16), T([64, 64, 128, 128], f16), T([128, 64, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 64, 128, 128], f16), T([64, 128, 128, 128], f16), T([64, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 64, 128, 128], f16), T([64, 64, 128, 128], f16), T([64, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 64, 128, 128], f16), T([64, 32, 128, 128], f16), T([64, 32, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 32, 128, 128], f16), T([64, 64, 128, 128], f16, stride=(2097152, 16384, 128, 1)), T([32, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 128, 128], f16), T([64, 64, 128, 128], f16), T([128, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 64, 128, 128], f16), T([64, 32, 256, 256], f16), T([64, 32, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 32, 256, 256], f16), T([64, 3, 256, 256], f16), T([32, 3, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [False, True, False]), {})
Operator: aten.copy_.default
cnt: 1, ((T([64, 3, 256, 256], f16), T([64, 3, 256, 256], f16)), {})
Operator: aten.div.Scalar
cnt: 1, ((T([64, 1024, 8, 8], f16, stride=(1024, 1, 0, 0)), 64), {})
Operator: aten.leaky_relu_.default
cnt: 1, ((T([64, 32, 256, 256], f16),), {})
cnt: 4, ((T([64, 64, 128, 128], f16),), {})
cnt: 1, ((T([64, 128, 128, 128], f16),), {})
cnt: 1, ((T([64, 32, 128, 128], f16),), {})
cnt: 3, ((T([64, 128, 64, 64], f16),), {})
cnt: 5, ((T([64, 64, 64, 64], f16),), {})
cnt: 3, ((T([64, 256, 32, 32], f16),), {})
cnt: 17, ((T([64, 128, 32, 32], f16),), {})
cnt: 3, ((T([64, 512, 16, 16], f16),), {})
cnt: 17, ((T([64, 256, 16, 16], f16),), {})
cnt: 3, ((T([64, 1024, 8, 8], f16),), {})
cnt: 9, ((T([64, 512, 8, 8], f16),), {})
Operator: aten.leaky_relu_backward.default
cnt: 3, ((T([64, 1024, 8, 8], f16), T([64, 1024, 8, 8], f16), 0.01, True), {})
cnt: 1, ((T([64, 512, 8, 8], f16, stride=(65536, 64, 8, 1)), T([64, 512, 8, 8], f16), 0.01, True), {})
cnt: 8, ((T([64, 512, 8, 8], f16), T([64, 512, 8, 8], f16), 0.01, True), {})
cnt: 3, ((T([64, 512, 16, 16], f16), T([64, 512, 16, 16], f16), 0.01, True), {})
cnt: 1, ((T([64, 256, 16, 16], f16, stride=(131072, 256, 16, 1)), T([64, 256, 16, 16], f16), 0.01, True), {})
cnt: 16, ((T([64, 256, 16, 16], f16), T([64, 256, 16, 16], f16), 0.01, True), {})
cnt: 3, ((T([64, 256, 32, 32], f16), T([64, 256, 32, 32], f16), 0.01, True), {})
cnt: 1, ((T([64, 128, 32, 32], f16, stride=(262144, 1024, 32, 1)), T([64, 128, 32, 32], f16), 0.01, True), {})
cnt: 16, ((T([64, 128, 32, 32], f16), T([64, 128, 32, 32], f16), 0.01, True), {})
cnt: 3, ((T([64, 128, 64, 64], f16), T([64, 128, 64, 64], f16), 0.01, True), {})
cnt: 1, ((T([64, 64, 64, 64], f16, stride=(524288, 4096, 64, 1)), T([64, 64, 64, 64], f16), 0.01, True), {})
cnt: 4, ((T([64, 64, 64, 64], f16), T([64, 64, 64, 64], f16), 0.01, True), {})
cnt: 3, ((T([64, 64, 128, 128], f16), T([64, 64, 128, 128], f16), 0.01, True), {})
cnt: 1, ((T([64, 64, 128, 128], f16, stride=(2097152, 16384, 128, 1)), T([64, 64, 128, 128], f16), 0.01, True), {})
cnt: 1, ((T([64, 32, 128, 128], f16), T([64, 32, 128, 128], f16), 0.01, True), {})
cnt: 1, ((T([64, 128, 128, 128], f16), T([64, 128, 128, 128], f16), 0.01, True), {})
cnt: 1, ((T([64, 32, 256, 256], f16), T([64, 32, 256, 256], f16), 0.01, True), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([64], i64),), {})
Operator: aten.mean.dim
cnt: 1, ((T([64, 1024, 8, 8], f16), [-1, -2], True), {})
Operator: aten.mm.default
cnt: 1, ((T([64, 1000], f16), T([1000, 1024], f16)), {})
cnt: 1, ((T([1000, 64], f16, stride=(1, 1000)), T([64, 1024], f16)), {})
Operator: aten.native_batch_norm.default
cnt: 1, ((T([64, 32, 256, 256], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([64, 64, 128, 128], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 128, 128, 128], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 32, 128, 128], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([64, 128, 64, 64], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 5, ((T([64, 64, 64, 64], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([64, 256, 32, 32], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 17, ((T([64, 128, 32, 32], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([64, 512, 16, 16], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
cnt: 17, ((T([64, 256, 16, 16], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([64, 1024, 8, 8], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f16), True, 0.1, 1e-05), {})
cnt: 9, ((T([64, 512, 8, 8], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
Operator: aten.native_batch_norm_backward.default
cnt: 3, ((T([64, 1024, 8, 8], f16), T([64, 1024, 8, 8], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f32), T([1024], f32), True, 1e-05, [True, True, True]), {})
cnt: 9, ((T([64, 512, 8, 8], f16), T([64, 512, 8, 8], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([64, 512, 16, 16], f16), T([64, 512, 16, 16], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 17, ((T([64, 256, 16, 16], f16), T([64, 256, 16, 16], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([64, 256, 32, 32], f16), T([64, 256, 32, 32], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 17, ((T([64, 128, 32, 32], f16), T([64, 128, 32, 32], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([64, 128, 64, 64], f16), T([64, 128, 64, 64], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 5, ((T([64, 64, 64, 64], f16), T([64, 64, 64, 64], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([64, 64, 128, 128], f16), T([64, 64, 128, 128], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 32, 128, 128], f16), T([64, 32, 128, 128], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 128, 128, 128], f16), T([64, 128, 128, 128], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 32, 256, 256], f16), T([64, 32, 256, 256], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 1e-05, [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([64, 1000], f16), T([64], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([64, 1000], f16), T([64], i64), None, 1, -100), {})
Operator: aten.slice_backward.default
cnt: 1, ((T([64, 512, 8, 8], f16), [64, 1024, 8, 8], 1, 512, 9223372036854775807, 1), {})
cnt: 2, ((T([64, 1024, 8, 8], f16), [64, 1024, 8, 8], 0, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([64, 512, 8, 8], f16, stride=(65536, 64, 8, 1)), [64, 1024, 8, 8], 1, 0, 512, 1), {})
cnt: 1, ((T([64, 256, 16, 16], f16), [64, 512, 16, 16], 1, 256, 9223372036854775807, 1), {})
cnt: 2, ((T([64, 512, 16, 16], f16), [64, 512, 16, 16], 0, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([64, 256, 16, 16], f16, stride=(131072, 256, 16, 1)), [64, 512, 16, 16], 1, 0, 256, 1), {})
cnt: 1, ((T([64, 128, 32, 32], f16), [64, 256, 32, 32], 1, 128, 9223372036854775807, 1), {})
cnt: 2, ((T([64, 256, 32, 32], f16), [64, 256, 32, 32], 0, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([64, 128, 32, 32], f16, stride=(262144, 1024, 32, 1)), [64, 256, 32, 32], 1, 0, 128, 1), {})
cnt: 1, ((T([64, 64, 64, 64], f16), [64, 128, 64, 64], 1, 64, 9223372036854775807, 1), {})
cnt: 2, ((T([64, 128, 64, 64], f16), [64, 128, 64, 64], 0, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([64, 64, 64, 64], f16, stride=(524288, 4096, 64, 1)), [64, 128, 64, 64], 1, 0, 64, 1), {})
cnt: 1, ((T([64, 64, 128, 128], f16), [64, 128, 128, 128], 1, 64, 9223372036854775807, 1), {})
cnt: 2, ((T([64, 128, 128, 128], f16), [64, 128, 128, 128], 0, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([64, 64, 128, 128], f16, stride=(2097152, 16384, 128, 1)), [64, 128, 128, 128], 1, 0, 64, 1), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([64, 1000], f16), [0], True), {})

View File

@ -0,0 +1,87 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([64, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([64, 1000], f16), T([64, 1000], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 12, ((T([64, 12, 198, 198], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 12, ((T([64, 12, 198, 198], f16), T([64, 12, 198, 198], f16), -1, f16), {})
Operator: aten._unsafe_view.default
cnt: 36, ((T([64, 12, 198, 64], f16), [768, 198, 64]), {})
cnt: 12, ((T([64, 12, 64, 198], f16), [768, 64, 198]), {})
cnt: 12, ((T([768, 198, 198], f16), [64, 12, 198, 198]), {})
cnt: 12, ((T([768, 198, 64], f16), [64, 12, 198, 64]), {})
cnt: 12, ((T([64, 198, 12, 64], f16), [64, 198, 768]), {})
cnt: 12, ((T([64, 198, 3, 12, 64], f16), [64, 198, 2304]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([64, 198, 768], f16), T([1, 198, 768], f16)), {})
cnt: 49, ((T([64, 198, 768], f16), T([64, 198, 768], f16)), {})
cnt: 1, ((T([64, 1000], f16), T([64, 1000], f16)), {})
Operator: aten.addmm.default
cnt: 12, ((T([2304], f16), T([12672, 768], f16), T([768, 2304], f16, stride=(1, 768))), {})
cnt: 12, ((T([768], f16), T([12672, 768], f16), T([768, 768], f16, stride=(1, 768))), {})
cnt: 12, ((T([3072], f16), T([12672, 768], f16), T([768, 3072], f16, stride=(1, 768))), {})
cnt: 12, ((T([768], f16), T([12672, 3072], f16), T([3072, 768], f16, stride=(1, 3072))), {})
cnt: 2, ((T([1000], f16), T([64, 768], f16, stride=(152064, 1)), T([768, 1000], f16, stride=(1, 768))), {})
Operator: aten.bmm.default
cnt: 12, ((T([768, 198, 64], f16), T([768, 64, 198], f16)), {})
cnt: 12, ((T([768, 198, 198], f16), T([768, 198, 64], f16)), {})
cnt: 12, ((T([768, 198, 198], f16, stride=(39204, 1, 198)), T([768, 198, 64], f16)), {})
cnt: 12, ((T([768, 198, 64], f16), T([768, 64, 198], f16, stride=(12672, 1, 64))), {})
cnt: 12, ((T([768, 64, 198], f16, stride=(12672, 1, 64)), T([768, 198, 198], f16)), {})
cnt: 12, ((T([768, 198, 198], f16), T([768, 198, 64], f16, stride=(12672, 1, 198))), {})
Operator: aten.cat.default
cnt: 1, (([T([64, 1, 768], f16, stride=(0, 768, 1)), T([64, 1, 768], f16, stride=(0, 768, 1)), T([64, 196, 768], f16, stride=(150528, 1, 196))], 1), {})
Operator: aten.clone.default
cnt: 1, ((T([64, 3, 224, 224], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([64, 3, 224, 224], f16), T([768, 3, 16, 16], f16), T([768], f16), [16, 16], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([64, 768, 14, 14], f16, stride=(152064, 1, 10752, 768)), T([64, 3, 224, 224], f16), T([768, 3, 16, 16], f16), [768], [16, 16], [0, 0], [1, 1], False, [0, 0], 1, [False, True, True]), {})
Operator: aten.copy_.default
cnt: 1, ((T([64, 3, 224, 224], f16), T([64, 3, 224, 224], f16)), {})
Operator: aten.div.Tensor
cnt: 2, ((T([64, 1000], f16), 2), {})
Operator: aten.gelu.default
cnt: 12, ((T([64, 198, 3072], f16),), {})
Operator: aten.gelu_backward.default
cnt: 12, ((T([64, 198, 3072], f16), T([64, 198, 3072], f16)), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([64], i64),), {})
Operator: aten.mm.default
cnt: 2, ((T([64, 1000], f16), T([1000, 768], f16)), {})
cnt: 2, ((T([1000, 64], f16, stride=(1, 1000)), T([64, 768], f16, stride=(152064, 1))), {})
cnt: 12, ((T([12672, 768], f16), T([768, 3072], f16)), {})
cnt: 12, ((T([768, 12672], f16, stride=(1, 768)), T([12672, 3072], f16)), {})
cnt: 12, ((T([12672, 3072], f16), T([3072, 768], f16)), {})
cnt: 12, ((T([3072, 12672], f16, stride=(1, 3072)), T([12672, 768], f16)), {})
cnt: 12, ((T([12672, 768], f16), T([768, 768], f16)), {})
cnt: 12, ((T([768, 12672], f16, stride=(1, 768)), T([12672, 768], f16)), {})
cnt: 12, ((T([12672, 2304], f16), T([2304, 768], f16)), {})
cnt: 12, ((T([2304, 12672], f16, stride=(1, 2304)), T([12672, 768], f16)), {})
Operator: aten.mul.Tensor
cnt: 24, ((T([64, 12, 198, 198], f16), 0.125), {})
Operator: aten.native_layer_norm.default
cnt: 25, ((T([64, 198, 768], f16), [768], T([768], f16), T([768], f16), 1e-06), {})
Operator: aten.native_layer_norm_backward.default
cnt: 25, ((T([64, 198, 768], f16), T([64, 198, 768], f16), [768], T([64, 198, 1], f32), T([64, 198, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([64, 1000], f16), T([64], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([64, 1000], f16), T([64], i64), None, 1, -100), {})
Operator: aten.select_backward.default
cnt: 1, ((T([64, 768], f16), [64, 198, 768], 1, 1), {})
cnt: 1, ((T([64, 768], f16), [64, 198, 768], 1, 0), {})
Operator: aten.slice_backward.default
cnt: 2, ((T([64, 198, 768], f16), [64, 198, 768], 0, 0, 9223372036854775807, 1), {})
Operator: aten.stack.default
cnt: 12, (([T([64, 12, 198, 64], f16), T([64, 12, 198, 64], f16, stride=(152064, 12672, 1, 198)), T([64, 12, 198, 64], f16)],), {})
Operator: aten.sum.SymInt
cnt: 2, ((T([64, 1000], f16), [0], True), {})
cnt: 24, ((T([12672, 768], f16), [0], True), {})
cnt: 12, ((T([12672, 3072], f16), [0], True), {})
cnt: 12, ((T([12672, 2304], f16), [0], True), {})
cnt: 1, ((T([64, 198, 768], f16), [0], True), {})
cnt: 2, ((T([64, 1, 768], f16, stride=(152064, 768, 1)), [0], True), {})
Operator: aten.unbind.int
cnt: 12, ((T([3, 64, 12, 198, 64], f16, stride=(768, 456192, 64, 2304, 1)),), {})

View File

@ -0,0 +1,616 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([64, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([64, 1000], f16), T([64, 1000], f16), 1, f16), {})
Operator: aten.add.Tensor
cnt: 121, ((T([], i64), 1), {})
cnt: 1, ((T([64, 512, 7, 7], f16, stride=(50176, 49, 7, 1)), T([64, 512, 7, 7], f16, stride=(48608, 49, 7, 1))), {})
cnt: 15, ((T([64, 32, 7, 7], f16, stride=(50176, 49, 7, 1)), T([64, 32, 7, 7], f16, stride=(48608, 49, 7, 1))), {})
cnt: 1, ((T([64, 512, 7, 7], f16), T([64, 512, 7, 7], f16, stride=(47040, 49, 7, 1))), {})
cnt: 14, ((T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16, stride=(47040, 49, 7, 1))), {})
cnt: 1, ((T([64, 512, 7, 7], f16), T([64, 512, 7, 7], f16, stride=(45472, 49, 7, 1))), {})
cnt: 13, ((T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16, stride=(45472, 49, 7, 1))), {})
cnt: 1, ((T([64, 512, 7, 7], f16), T([64, 512, 7, 7], f16, stride=(43904, 49, 7, 1))), {})
cnt: 12, ((T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16, stride=(43904, 49, 7, 1))), {})
cnt: 1, ((T([64, 512, 7, 7], f16), T([64, 512, 7, 7], f16, stride=(42336, 49, 7, 1))), {})
cnt: 11, ((T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16, stride=(42336, 49, 7, 1))), {})
cnt: 1, ((T([64, 512, 7, 7], f16), T([64, 512, 7, 7], f16, stride=(40768, 49, 7, 1))), {})
cnt: 10, ((T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16, stride=(40768, 49, 7, 1))), {})
cnt: 1, ((T([64, 512, 7, 7], f16), T([64, 512, 7, 7], f16, stride=(39200, 49, 7, 1))), {})
cnt: 9, ((T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16, stride=(39200, 49, 7, 1))), {})
cnt: 1, ((T([64, 512, 7, 7], f16), T([64, 512, 7, 7], f16, stride=(37632, 49, 7, 1))), {})
cnt: 8, ((T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16, stride=(37632, 49, 7, 1))), {})
cnt: 1, ((T([64, 512, 7, 7], f16), T([64, 512, 7, 7], f16, stride=(36064, 49, 7, 1))), {})
cnt: 7, ((T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16, stride=(36064, 49, 7, 1))), {})
cnt: 1, ((T([64, 512, 7, 7], f16), T([64, 512, 7, 7], f16, stride=(34496, 49, 7, 1))), {})
cnt: 6, ((T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16, stride=(34496, 49, 7, 1))), {})
cnt: 1, ((T([64, 512, 7, 7], f16), T([64, 512, 7, 7], f16, stride=(32928, 49, 7, 1))), {})
cnt: 5, ((T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16, stride=(32928, 49, 7, 1))), {})
cnt: 1, ((T([64, 512, 7, 7], f16), T([64, 512, 7, 7], f16, stride=(31360, 49, 7, 1))), {})
cnt: 4, ((T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16, stride=(31360, 49, 7, 1))), {})
cnt: 1, ((T([64, 512, 7, 7], f16), T([64, 512, 7, 7], f16, stride=(29792, 49, 7, 1))), {})
cnt: 3, ((T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16, stride=(29792, 49, 7, 1))), {})
cnt: 1, ((T([64, 512, 7, 7], f16), T([64, 512, 7, 7], f16, stride=(28224, 49, 7, 1))), {})
cnt: 2, ((T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16, stride=(28224, 49, 7, 1))), {})
cnt: 1, ((T([64, 512, 7, 7], f16), T([64, 512, 7, 7], f16, stride=(26656, 49, 7, 1))), {})
cnt: 1, ((T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16, stride=(26656, 49, 7, 1))), {})
cnt: 1, ((T([64, 512, 7, 7], f16), T([64, 512, 7, 7], f16)), {})
cnt: 1, ((T([64, 256, 14, 14], f16, stride=(200704, 196, 14, 1)), T([64, 256, 14, 14], f16, stride=(194432, 196, 14, 1))), {})
cnt: 23, ((T([64, 32, 14, 14], f16, stride=(200704, 196, 14, 1)), T([64, 32, 14, 14], f16, stride=(194432, 196, 14, 1))), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16, stride=(188160, 196, 14, 1))), {})
cnt: 22, ((T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16, stride=(188160, 196, 14, 1))), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16, stride=(181888, 196, 14, 1))), {})
cnt: 21, ((T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16, stride=(181888, 196, 14, 1))), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16, stride=(175616, 196, 14, 1))), {})
cnt: 20, ((T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16, stride=(175616, 196, 14, 1))), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16, stride=(169344, 196, 14, 1))), {})
cnt: 19, ((T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16, stride=(169344, 196, 14, 1))), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16, stride=(163072, 196, 14, 1))), {})
cnt: 18, ((T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16, stride=(163072, 196, 14, 1))), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16, stride=(156800, 196, 14, 1))), {})
cnt: 17, ((T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16, stride=(156800, 196, 14, 1))), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16, stride=(150528, 196, 14, 1))), {})
cnt: 16, ((T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16, stride=(150528, 196, 14, 1))), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16, stride=(144256, 196, 14, 1))), {})
cnt: 15, ((T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16, stride=(144256, 196, 14, 1))), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16, stride=(137984, 196, 14, 1))), {})
cnt: 14, ((T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16, stride=(137984, 196, 14, 1))), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16, stride=(131712, 196, 14, 1))), {})
cnt: 13, ((T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16, stride=(131712, 196, 14, 1))), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16, stride=(125440, 196, 14, 1))), {})
cnt: 12, ((T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16, stride=(125440, 196, 14, 1))), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16, stride=(119168, 196, 14, 1))), {})
cnt: 11, ((T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16, stride=(119168, 196, 14, 1))), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16, stride=(112896, 196, 14, 1))), {})
cnt: 10, ((T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16, stride=(112896, 196, 14, 1))), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16, stride=(106624, 196, 14, 1))), {})
cnt: 9, ((T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16, stride=(106624, 196, 14, 1))), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16, stride=(100352, 196, 14, 1))), {})
cnt: 8, ((T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16, stride=(100352, 196, 14, 1))), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16, stride=(94080, 196, 14, 1))), {})
cnt: 7, ((T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16, stride=(94080, 196, 14, 1))), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16, stride=(87808, 196, 14, 1))), {})
cnt: 6, ((T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16, stride=(87808, 196, 14, 1))), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16, stride=(81536, 196, 14, 1))), {})
cnt: 5, ((T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16, stride=(81536, 196, 14, 1))), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16, stride=(75264, 196, 14, 1))), {})
cnt: 4, ((T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16, stride=(75264, 196, 14, 1))), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16, stride=(68992, 196, 14, 1))), {})
cnt: 3, ((T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16, stride=(68992, 196, 14, 1))), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16, stride=(62720, 196, 14, 1))), {})
cnt: 2, ((T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16, stride=(62720, 196, 14, 1))), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16, stride=(56448, 196, 14, 1))), {})
cnt: 1, ((T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16, stride=(56448, 196, 14, 1))), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16)), {})
cnt: 1, ((T([64, 128, 28, 28], f16, stride=(401408, 784, 28, 1)), T([64, 128, 28, 28], f16, stride=(376320, 784, 28, 1))), {})
cnt: 11, ((T([64, 32, 28, 28], f16, stride=(401408, 784, 28, 1)), T([64, 32, 28, 28], f16, stride=(376320, 784, 28, 1))), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 128, 28, 28], f16, stride=(351232, 784, 28, 1))), {})
cnt: 10, ((T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16, stride=(351232, 784, 28, 1))), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 128, 28, 28], f16, stride=(326144, 784, 28, 1))), {})
cnt: 9, ((T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16, stride=(326144, 784, 28, 1))), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 128, 28, 28], f16, stride=(301056, 784, 28, 1))), {})
cnt: 8, ((T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16, stride=(301056, 784, 28, 1))), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 128, 28, 28], f16, stride=(275968, 784, 28, 1))), {})
cnt: 7, ((T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16, stride=(275968, 784, 28, 1))), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 128, 28, 28], f16, stride=(250880, 784, 28, 1))), {})
cnt: 6, ((T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16, stride=(250880, 784, 28, 1))), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 128, 28, 28], f16, stride=(225792, 784, 28, 1))), {})
cnt: 5, ((T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16, stride=(225792, 784, 28, 1))), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 128, 28, 28], f16, stride=(200704, 784, 28, 1))), {})
cnt: 4, ((T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16, stride=(200704, 784, 28, 1))), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 128, 28, 28], f16, stride=(175616, 784, 28, 1))), {})
cnt: 3, ((T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16, stride=(175616, 784, 28, 1))), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 128, 28, 28], f16, stride=(150528, 784, 28, 1))), {})
cnt: 2, ((T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16, stride=(150528, 784, 28, 1))), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 128, 28, 28], f16, stride=(125440, 784, 28, 1))), {})
cnt: 1, ((T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16, stride=(125440, 784, 28, 1))), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 128, 28, 28], f16)), {})
cnt: 1, ((T([64, 64, 56, 56], f16, stride=(802816, 3136, 56, 1)), T([64, 64, 56, 56], f16, stride=(702464, 3136, 56, 1))), {})
cnt: 5, ((T([64, 32, 56, 56], f16, stride=(802816, 3136, 56, 1)), T([64, 32, 56, 56], f16, stride=(702464, 3136, 56, 1))), {})
cnt: 1, ((T([64, 64, 56, 56], f16), T([64, 64, 56, 56], f16, stride=(602112, 3136, 56, 1))), {})
cnt: 4, ((T([64, 32, 56, 56], f16), T([64, 32, 56, 56], f16, stride=(602112, 3136, 56, 1))), {})
cnt: 1, ((T([64, 64, 56, 56], f16), T([64, 64, 56, 56], f16, stride=(501760, 3136, 56, 1))), {})
cnt: 3, ((T([64, 32, 56, 56], f16), T([64, 32, 56, 56], f16, stride=(501760, 3136, 56, 1))), {})
cnt: 1, ((T([64, 64, 56, 56], f16), T([64, 64, 56, 56], f16, stride=(401408, 3136, 56, 1))), {})
cnt: 2, ((T([64, 32, 56, 56], f16), T([64, 32, 56, 56], f16, stride=(401408, 3136, 56, 1))), {})
cnt: 1, ((T([64, 64, 56, 56], f16), T([64, 64, 56, 56], f16, stride=(301056, 3136, 56, 1))), {})
cnt: 1, ((T([64, 32, 56, 56], f16), T([64, 32, 56, 56], f16, stride=(301056, 3136, 56, 1))), {})
cnt: 1, ((T([64, 64, 56, 56], f16), T([64, 64, 56, 56], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([64, 1024], f16), T([1024, 1000], f16, stride=(1, 1024))), {})
Operator: aten.avg_pool2d.default
cnt: 1, ((T([64, 128, 56, 56], f16), [2, 2], [2, 2]), {})
cnt: 1, ((T([64, 256, 28, 28], f16), [2, 2], [2, 2]), {})
cnt: 1, ((T([64, 512, 14, 14], f16), [2, 2], [2, 2]), {})
Operator: aten.avg_pool2d_backward.default
cnt: 1, ((T([64, 512, 7, 7], f16), T([64, 512, 14, 14], f16), [2, 2], [2, 2], [0, 0], False, True, None), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 28, 28], f16), [2, 2], [2, 2], [0, 0], False, True, None), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 128, 56, 56], f16), [2, 2], [2, 2], [0, 0], False, True, None), {})
Operator: aten.cat.default
cnt: 1, (([T([64, 64, 56, 56], f16)], 1), {})
cnt: 1, (([T([64, 64, 56, 56], f16), T([64, 32, 56, 56], f16)], 1), {})
cnt: 1, (([T([64, 64, 56, 56], f16), T([64, 32, 56, 56], f16), T([64, 32, 56, 56], f16)], 1), {})
cnt: 1, (([T([64, 64, 56, 56], f16), T([64, 32, 56, 56], f16), T([64, 32, 56, 56], f16), T([64, 32, 56, 56], f16)], 1), {})
cnt: 1, (([T([64, 64, 56, 56], f16), T([64, 32, 56, 56], f16), T([64, 32, 56, 56], f16), T([64, 32, 56, 56], f16), T([64, 32, 56, 56], f16)], 1), {})
cnt: 1, (([T([64, 64, 56, 56], f16), T([64, 32, 56, 56], f16), T([64, 32, 56, 56], f16), T([64, 32, 56, 56], f16), T([64, 32, 56, 56], f16), T([64, 32, 56, 56], f16)], 1), {})
cnt: 1, (([T([64, 64, 56, 56], f16), T([64, 32, 56, 56], f16), T([64, 32, 56, 56], f16), T([64, 32, 56, 56], f16), T([64, 32, 56, 56], f16), T([64, 32, 56, 56], f16), T([64, 32, 56, 56], f16)], 1), {})
cnt: 1, (([T([64, 128, 28, 28], f16)], 1), {})
cnt: 1, (([T([64, 128, 28, 28], f16), T([64, 32, 28, 28], f16)], 1), {})
cnt: 1, (([T([64, 128, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16)], 1), {})
cnt: 1, (([T([64, 128, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16)], 1), {})
cnt: 1, (([T([64, 128, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16)], 1), {})
cnt: 1, (([T([64, 128, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16)], 1), {})
cnt: 1, (([T([64, 128, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16)], 1), {})
cnt: 1, (([T([64, 128, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16)], 1), {})
cnt: 1, (([T([64, 128, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16)], 1), {})
cnt: 1, (([T([64, 128, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16)], 1), {})
cnt: 1, (([T([64, 128, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16)], 1), {})
cnt: 1, (([T([64, 128, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16)], 1), {})
cnt: 1, (([T([64, 128, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16), T([64, 32, 28, 28], f16)], 1), {})
cnt: 1, (([T([64, 256, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 256, 14, 14], f16), T([64, 32, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 256, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 256, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 256, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 256, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 256, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 256, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 256, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 256, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 256, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 256, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 256, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 256, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 256, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 256, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 256, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 256, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 256, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 256, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 256, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 256, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 256, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 256, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 256, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16), T([64, 32, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 512, 7, 7], f16)], 1), {})
cnt: 1, (([T([64, 512, 7, 7], f16), T([64, 32, 7, 7], f16)], 1), {})
cnt: 1, (([T([64, 512, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16)], 1), {})
cnt: 1, (([T([64, 512, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16)], 1), {})
cnt: 1, (([T([64, 512, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16)], 1), {})
cnt: 1, (([T([64, 512, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16)], 1), {})
cnt: 1, (([T([64, 512, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16)], 1), {})
cnt: 1, (([T([64, 512, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16)], 1), {})
cnt: 1, (([T([64, 512, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16)], 1), {})
cnt: 1, (([T([64, 512, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16)], 1), {})
cnt: 1, (([T([64, 512, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16)], 1), {})
cnt: 1, (([T([64, 512, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16)], 1), {})
cnt: 1, (([T([64, 512, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16)], 1), {})
cnt: 1, (([T([64, 512, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16)], 1), {})
cnt: 1, (([T([64, 512, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16)], 1), {})
cnt: 1, (([T([64, 512, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16)], 1), {})
cnt: 1, (([T([64, 512, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16), T([64, 32, 7, 7], f16)], 1), {})
Operator: aten.clone.default
cnt: 1, ((T([64, 3, 224, 224], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([64, 3, 224, 224], f16), T([64, 3, 7, 7], f16), None, [2, 2], [3, 3], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 64, 56, 56], f16), T([128, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 6, ((T([64, 128, 56, 56], f16), T([32, 128, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 96, 56, 56], f16), T([128, 96, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 128, 56, 56], f16), T([128, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 160, 56, 56], f16), T([128, 160, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 192, 56, 56], f16), T([128, 192, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 224, 56, 56], f16), T([128, 224, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 256, 56, 56], f16), T([128, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([128, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 12, ((T([64, 128, 28, 28], f16), T([32, 128, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 160, 28, 28], f16), T([128, 160, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 192, 28, 28], f16), T([128, 192, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 224, 28, 28], f16), T([128, 224, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 256, 28, 28], f16), T([128, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 288, 28, 28], f16), T([128, 288, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 320, 28, 28], f16), T([128, 320, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 352, 28, 28], f16), T([128, 352, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 384, 28, 28], f16), T([128, 384, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 416, 28, 28], f16), T([128, 416, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 448, 28, 28], f16), T([128, 448, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 480, 28, 28], f16), T([128, 480, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 512, 28, 28], f16), T([256, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([128, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 24, ((T([64, 128, 14, 14], f16), T([32, 128, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 288, 14, 14], f16), T([128, 288, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 320, 14, 14], f16), T([128, 320, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 352, 14, 14], f16), T([128, 352, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 384, 14, 14], f16), T([128, 384, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 416, 14, 14], f16), T([128, 416, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 448, 14, 14], f16), T([128, 448, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 480, 14, 14], f16), T([128, 480, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 512, 14, 14], f16), T([128, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 544, 14, 14], f16), T([128, 544, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 576, 14, 14], f16), T([128, 576, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 608, 14, 14], f16), T([128, 608, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 640, 14, 14], f16), T([128, 640, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 672, 14, 14], f16), T([128, 672, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 704, 14, 14], f16), T([128, 704, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 736, 14, 14], f16), T([128, 736, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 768, 14, 14], f16), T([128, 768, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 800, 14, 14], f16), T([128, 800, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 832, 14, 14], f16), T([128, 832, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 864, 14, 14], f16), T([128, 864, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 896, 14, 14], f16), T([128, 896, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 928, 14, 14], f16), T([128, 928, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 960, 14, 14], f16), T([128, 960, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 992, 14, 14], f16), T([128, 992, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 1024, 14, 14], f16), T([512, 1024, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 512, 7, 7], f16), T([128, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 16, ((T([64, 128, 7, 7], f16), T([32, 128, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 544, 7, 7], f16), T([128, 544, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 576, 7, 7], f16), T([128, 576, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 608, 7, 7], f16), T([128, 608, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 640, 7, 7], f16), T([128, 640, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 672, 7, 7], f16), T([128, 672, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 704, 7, 7], f16), T([128, 704, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 736, 7, 7], f16), T([128, 736, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 768, 7, 7], f16), T([128, 768, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 800, 7, 7], f16), T([128, 800, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 832, 7, 7], f16), T([128, 832, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 864, 7, 7], f16), T([128, 864, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 896, 7, 7], f16), T([128, 896, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 928, 7, 7], f16), T([128, 928, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 960, 7, 7], f16), T([128, 960, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 992, 7, 7], f16), T([128, 992, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([64, 32, 7, 7], f16, stride=(50176, 49, 7, 1)), T([64, 128, 7, 7], f16), T([32, 128, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 7, 7], f16), T([64, 992, 7, 7], f16), T([128, 992, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 15, ((T([64, 32, 7, 7], f16), T([64, 128, 7, 7], f16), T([32, 128, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 7, 7], f16), T([64, 960, 7, 7], f16), T([128, 960, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 7, 7], f16), T([64, 928, 7, 7], f16), T([128, 928, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 7, 7], f16), T([64, 896, 7, 7], f16), T([128, 896, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 7, 7], f16), T([64, 864, 7, 7], f16), T([128, 864, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 7, 7], f16), T([64, 832, 7, 7], f16), T([128, 832, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 7, 7], f16), T([64, 800, 7, 7], f16), T([128, 800, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 7, 7], f16), T([64, 768, 7, 7], f16), T([128, 768, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 7, 7], f16), T([64, 736, 7, 7], f16), T([128, 736, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 7, 7], f16), T([64, 704, 7, 7], f16), T([128, 704, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 7, 7], f16), T([64, 672, 7, 7], f16), T([128, 672, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 7, 7], f16), T([64, 640, 7, 7], f16), T([128, 640, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 7, 7], f16), T([64, 608, 7, 7], f16), T([128, 608, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 7, 7], f16), T([64, 576, 7, 7], f16), T([128, 576, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 7, 7], f16), T([64, 544, 7, 7], f16), T([128, 544, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 7, 7], f16), T([64, 512, 7, 7], f16), T([128, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 512, 14, 14], f16), T([64, 1024, 14, 14], f16), T([512, 1024, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 32, 14, 14], f16, stride=(200704, 196, 14, 1)), T([64, 128, 14, 14], f16), T([32, 128, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 14, 14], f16), T([64, 992, 14, 14], f16), T([128, 992, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 23, ((T([64, 32, 14, 14], f16), T([64, 128, 14, 14], f16), T([32, 128, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 14, 14], f16), T([64, 960, 14, 14], f16), T([128, 960, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 14, 14], f16), T([64, 928, 14, 14], f16), T([128, 928, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 14, 14], f16), T([64, 896, 14, 14], f16), T([128, 896, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 14, 14], f16), T([64, 864, 14, 14], f16), T([128, 864, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 14, 14], f16), T([64, 832, 14, 14], f16), T([128, 832, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 14, 14], f16), T([64, 800, 14, 14], f16), T([128, 800, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 14, 14], f16), T([64, 768, 14, 14], f16), T([128, 768, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 14, 14], f16), T([64, 736, 14, 14], f16), T([128, 736, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 14, 14], f16), T([64, 704, 14, 14], f16), T([128, 704, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 14, 14], f16), T([64, 672, 14, 14], f16), T([128, 672, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 14, 14], f16), T([64, 640, 14, 14], f16), T([128, 640, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 14, 14], f16), T([64, 608, 14, 14], f16), T([128, 608, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 14, 14], f16), T([64, 576, 14, 14], f16), T([128, 576, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 14, 14], f16), T([64, 544, 14, 14], f16), T([128, 544, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 14, 14], f16), T([64, 512, 14, 14], f16), T([128, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 14, 14], f16), T([64, 480, 14, 14], f16), T([128, 480, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 14, 14], f16), T([64, 448, 14, 14], f16), T([128, 448, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 14, 14], f16), T([64, 416, 14, 14], f16), T([128, 416, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 14, 14], f16), T([64, 384, 14, 14], f16), T([128, 384, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 14, 14], f16), T([64, 352, 14, 14], f16), T([128, 352, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 14, 14], f16), T([64, 320, 14, 14], f16), T([128, 320, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 14, 14], f16), T([64, 288, 14, 14], f16), T([128, 288, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 14, 14], f16), T([64, 256, 14, 14], f16), T([128, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 256, 28, 28], f16), T([64, 512, 28, 28], f16), T([256, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 32, 28, 28], f16, stride=(401408, 784, 28, 1)), T([64, 128, 28, 28], f16), T([32, 128, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 480, 28, 28], f16), T([128, 480, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 11, ((T([64, 32, 28, 28], f16), T([64, 128, 28, 28], f16), T([32, 128, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 448, 28, 28], f16), T([128, 448, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 416, 28, 28], f16), T([128, 416, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 384, 28, 28], f16), T([128, 384, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 352, 28, 28], f16), T([128, 352, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 320, 28, 28], f16), T([128, 320, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 288, 28, 28], f16), T([128, 288, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 256, 28, 28], f16), T([128, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 224, 28, 28], f16), T([128, 224, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 192, 28, 28], f16), T([128, 192, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 160, 28, 28], f16), T([128, 160, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 128, 28, 28], f16), T([128, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 56, 56], f16), T([64, 256, 56, 56], f16), T([128, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 32, 56, 56], f16, stride=(802816, 3136, 56, 1)), T([64, 128, 56, 56], f16), T([32, 128, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 56, 56], f16), T([64, 224, 56, 56], f16), T([128, 224, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 5, ((T([64, 32, 56, 56], f16), T([64, 128, 56, 56], f16), T([32, 128, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 56, 56], f16), T([64, 192, 56, 56], f16), T([128, 192, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 56, 56], f16), T([64, 160, 56, 56], f16), T([128, 160, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 56, 56], f16), T([64, 128, 56, 56], f16), T([128, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 56, 56], f16), T([64, 96, 56, 56], f16), T([128, 96, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 56, 56], f16), T([64, 64, 56, 56], f16), T([128, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 64, 112, 112], f16), T([64, 3, 224, 224], f16), T([64, 3, 7, 7], f16), [0], [2, 2], [3, 3], [1, 1], False, [0, 0], 1, [False, True, False]), {})
Operator: aten.copy_.default
cnt: 1, ((T([64, 3, 224, 224], f16), T([64, 3, 224, 224], f16)), {})
Operator: aten.div.Scalar
cnt: 1, ((T([64, 1024, 7, 7], f16, stride=(1024, 1, 0, 0)), 49), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([64], i64),), {})
Operator: aten.max_pool2d_with_indices.default
cnt: 1, ((T([64, 64, 112, 112], f16), [3, 3], [2, 2], [1, 1]), {})
Operator: aten.max_pool2d_with_indices_backward.default
cnt: 1, ((T([64, 64, 56, 56], f16), T([64, 64, 112, 112], f16), [3, 3], [2, 2], [1, 1], [1, 1], False, T([64, 64, 56, 56], i64)), {})
Operator: aten.mean.dim
cnt: 1, ((T([64, 1024, 7, 7], f16), [-1, -2], True), {})
Operator: aten.mm.default
cnt: 1, ((T([64, 1000], f16), T([1000, 1024], f16)), {})
cnt: 1, ((T([1000, 64], f16, stride=(1, 1000)), T([64, 1024], f16)), {})
Operator: aten.native_batch_norm.default
cnt: 1, ((T([64, 64, 112, 112], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 64, 56, 56], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 7, ((T([64, 128, 56, 56], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 96, 56, 56], f16), T([96], f16), T([96], f16), T([96], f16), T([96], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 160, 56, 56], f16), T([160], f16), T([160], f16), T([160], f16), T([160], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 192, 56, 56], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 224, 56, 56], f16), T([224], f16), T([224], f16), T([224], f16), T([224], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 256, 56, 56], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 13, ((T([64, 128, 28, 28], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 160, 28, 28], f16), T([160], f16), T([160], f16), T([160], f16), T([160], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 192, 28, 28], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 224, 28, 28], f16), T([224], f16), T([224], f16), T([224], f16), T([224], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 256, 28, 28], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 288, 28, 28], f16), T([288], f16), T([288], f16), T([288], f16), T([288], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 320, 28, 28], f16), T([320], f16), T([320], f16), T([320], f16), T([320], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 352, 28, 28], f16), T([352], f16), T([352], f16), T([352], f16), T([352], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 384, 28, 28], f16), T([384], f16), T([384], f16), T([384], f16), T([384], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 416, 28, 28], f16), T([416], f16), T([416], f16), T([416], f16), T([416], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 448, 28, 28], f16), T([448], f16), T([448], f16), T([448], f16), T([448], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 480, 28, 28], f16), T([480], f16), T([480], f16), T([480], f16), T([480], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 512, 28, 28], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 24, ((T([64, 128, 14, 14], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 288, 14, 14], f16), T([288], f16), T([288], f16), T([288], f16), T([288], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 320, 14, 14], f16), T([320], f16), T([320], f16), T([320], f16), T([320], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 352, 14, 14], f16), T([352], f16), T([352], f16), T([352], f16), T([352], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 384, 14, 14], f16), T([384], f16), T([384], f16), T([384], f16), T([384], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 416, 14, 14], f16), T([416], f16), T([416], f16), T([416], f16), T([416], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 448, 14, 14], f16), T([448], f16), T([448], f16), T([448], f16), T([448], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 480, 14, 14], f16), T([480], f16), T([480], f16), T([480], f16), T([480], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 512, 14, 14], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 544, 14, 14], f16), T([544], f16), T([544], f16), T([544], f16), T([544], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 576, 14, 14], f16), T([576], f16), T([576], f16), T([576], f16), T([576], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 608, 14, 14], f16), T([608], f16), T([608], f16), T([608], f16), T([608], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 640, 14, 14], f16), T([640], f16), T([640], f16), T([640], f16), T([640], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 672, 14, 14], f16), T([672], f16), T([672], f16), T([672], f16), T([672], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 704, 14, 14], f16), T([704], f16), T([704], f16), T([704], f16), T([704], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 736, 14, 14], f16), T([736], f16), T([736], f16), T([736], f16), T([736], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 768, 14, 14], f16), T([768], f16), T([768], f16), T([768], f16), T([768], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 800, 14, 14], f16), T([800], f16), T([800], f16), T([800], f16), T([800], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 832, 14, 14], f16), T([832], f16), T([832], f16), T([832], f16), T([832], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 864, 14, 14], f16), T([864], f16), T([864], f16), T([864], f16), T([864], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 896, 14, 14], f16), T([896], f16), T([896], f16), T([896], f16), T([896], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 928, 14, 14], f16), T([928], f16), T([928], f16), T([928], f16), T([928], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 960, 14, 14], f16), T([960], f16), T([960], f16), T([960], f16), T([960], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 992, 14, 14], f16), T([992], f16), T([992], f16), T([992], f16), T([992], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 1024, 14, 14], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 512, 7, 7], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
cnt: 16, ((T([64, 128, 7, 7], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 544, 7, 7], f16), T([544], f16), T([544], f16), T([544], f16), T([544], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 576, 7, 7], f16), T([576], f16), T([576], f16), T([576], f16), T([576], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 608, 7, 7], f16), T([608], f16), T([608], f16), T([608], f16), T([608], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 640, 7, 7], f16), T([640], f16), T([640], f16), T([640], f16), T([640], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 672, 7, 7], f16), T([672], f16), T([672], f16), T([672], f16), T([672], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 704, 7, 7], f16), T([704], f16), T([704], f16), T([704], f16), T([704], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 736, 7, 7], f16), T([736], f16), T([736], f16), T([736], f16), T([736], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 768, 7, 7], f16), T([768], f16), T([768], f16), T([768], f16), T([768], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 800, 7, 7], f16), T([800], f16), T([800], f16), T([800], f16), T([800], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 832, 7, 7], f16), T([832], f16), T([832], f16), T([832], f16), T([832], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 864, 7, 7], f16), T([864], f16), T([864], f16), T([864], f16), T([864], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 896, 7, 7], f16), T([896], f16), T([896], f16), T([896], f16), T([896], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 928, 7, 7], f16), T([928], f16), T([928], f16), T([928], f16), T([928], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 960, 7, 7], f16), T([960], f16), T([960], f16), T([960], f16), T([960], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 992, 7, 7], f16), T([992], f16), T([992], f16), T([992], f16), T([992], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 1024, 7, 7], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f16), True, 0.1, 1e-05), {})
Operator: aten.native_batch_norm_backward.default
cnt: 1, ((T([64, 1024, 7, 7], f16), T([64, 1024, 7, 7], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f32), T([1024], f32), True, 1e-05, [True, True, True]), {})
cnt: 16, ((T([64, 128, 7, 7], f16), T([64, 128, 7, 7], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 992, 7, 7], f16), T([64, 992, 7, 7], f16), T([992], f16), T([992], f16), T([992], f16), T([992], f32), T([992], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 960, 7, 7], f16), T([64, 960, 7, 7], f16), T([960], f16), T([960], f16), T([960], f16), T([960], f32), T([960], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 928, 7, 7], f16), T([64, 928, 7, 7], f16), T([928], f16), T([928], f16), T([928], f16), T([928], f32), T([928], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 896, 7, 7], f16), T([64, 896, 7, 7], f16), T([896], f16), T([896], f16), T([896], f16), T([896], f32), T([896], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 864, 7, 7], f16), T([64, 864, 7, 7], f16), T([864], f16), T([864], f16), T([864], f16), T([864], f32), T([864], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 832, 7, 7], f16), T([64, 832, 7, 7], f16), T([832], f16), T([832], f16), T([832], f16), T([832], f32), T([832], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 800, 7, 7], f16), T([64, 800, 7, 7], f16), T([800], f16), T([800], f16), T([800], f16), T([800], f32), T([800], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 768, 7, 7], f16), T([64, 768, 7, 7], f16), T([768], f16), T([768], f16), T([768], f16), T([768], f32), T([768], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 736, 7, 7], f16), T([64, 736, 7, 7], f16), T([736], f16), T([736], f16), T([736], f16), T([736], f32), T([736], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 704, 7, 7], f16), T([64, 704, 7, 7], f16), T([704], f16), T([704], f16), T([704], f16), T([704], f32), T([704], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 672, 7, 7], f16), T([64, 672, 7, 7], f16), T([672], f16), T([672], f16), T([672], f16), T([672], f32), T([672], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 640, 7, 7], f16), T([64, 640, 7, 7], f16), T([640], f16), T([640], f16), T([640], f16), T([640], f32), T([640], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 608, 7, 7], f16), T([64, 608, 7, 7], f16), T([608], f16), T([608], f16), T([608], f16), T([608], f32), T([608], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 576, 7, 7], f16), T([64, 576, 7, 7], f16), T([576], f16), T([576], f16), T([576], f16), T([576], f32), T([576], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 544, 7, 7], f16), T([64, 544, 7, 7], f16), T([544], f16), T([544], f16), T([544], f16), T([544], f32), T([544], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 512, 7, 7], f16), T([64, 512, 7, 7], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 1024, 14, 14], f16), T([64, 1024, 14, 14], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f32), T([1024], f32), True, 1e-05, [True, True, True]), {})
cnt: 24, ((T([64, 128, 14, 14], f16), T([64, 128, 14, 14], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 992, 14, 14], f16), T([64, 992, 14, 14], f16), T([992], f16), T([992], f16), T([992], f16), T([992], f32), T([992], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 960, 14, 14], f16), T([64, 960, 14, 14], f16), T([960], f16), T([960], f16), T([960], f16), T([960], f32), T([960], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 928, 14, 14], f16), T([64, 928, 14, 14], f16), T([928], f16), T([928], f16), T([928], f16), T([928], f32), T([928], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 896, 14, 14], f16), T([64, 896, 14, 14], f16), T([896], f16), T([896], f16), T([896], f16), T([896], f32), T([896], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 864, 14, 14], f16), T([64, 864, 14, 14], f16), T([864], f16), T([864], f16), T([864], f16), T([864], f32), T([864], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 832, 14, 14], f16), T([64, 832, 14, 14], f16), T([832], f16), T([832], f16), T([832], f16), T([832], f32), T([832], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 800, 14, 14], f16), T([64, 800, 14, 14], f16), T([800], f16), T([800], f16), T([800], f16), T([800], f32), T([800], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 768, 14, 14], f16), T([64, 768, 14, 14], f16), T([768], f16), T([768], f16), T([768], f16), T([768], f32), T([768], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 736, 14, 14], f16), T([64, 736, 14, 14], f16), T([736], f16), T([736], f16), T([736], f16), T([736], f32), T([736], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 704, 14, 14], f16), T([64, 704, 14, 14], f16), T([704], f16), T([704], f16), T([704], f16), T([704], f32), T([704], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 672, 14, 14], f16), T([64, 672, 14, 14], f16), T([672], f16), T([672], f16), T([672], f16), T([672], f32), T([672], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 640, 14, 14], f16), T([64, 640, 14, 14], f16), T([640], f16), T([640], f16), T([640], f16), T([640], f32), T([640], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 608, 14, 14], f16), T([64, 608, 14, 14], f16), T([608], f16), T([608], f16), T([608], f16), T([608], f32), T([608], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 576, 14, 14], f16), T([64, 576, 14, 14], f16), T([576], f16), T([576], f16), T([576], f16), T([576], f32), T([576], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 544, 14, 14], f16), T([64, 544, 14, 14], f16), T([544], f16), T([544], f16), T([544], f16), T([544], f32), T([544], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 512, 14, 14], f16), T([64, 512, 14, 14], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 480, 14, 14], f16), T([64, 480, 14, 14], f16), T([480], f16), T([480], f16), T([480], f16), T([480], f32), T([480], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 448, 14, 14], f16), T([64, 448, 14, 14], f16), T([448], f16), T([448], f16), T([448], f16), T([448], f32), T([448], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 416, 14, 14], f16), T([64, 416, 14, 14], f16), T([416], f16), T([416], f16), T([416], f16), T([416], f32), T([416], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 384, 14, 14], f16), T([64, 384, 14, 14], f16), T([384], f16), T([384], f16), T([384], f16), T([384], f32), T([384], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 352, 14, 14], f16), T([64, 352, 14, 14], f16), T([352], f16), T([352], f16), T([352], f16), T([352], f32), T([352], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 320, 14, 14], f16), T([64, 320, 14, 14], f16), T([320], f16), T([320], f16), T([320], f16), T([320], f32), T([320], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 288, 14, 14], f16), T([64, 288, 14, 14], f16), T([288], f16), T([288], f16), T([288], f16), T([288], f32), T([288], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 512, 28, 28], f16), T([64, 512, 28, 28], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 13, ((T([64, 128, 28, 28], f16), T([64, 128, 28, 28], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 480, 28, 28], f16), T([64, 480, 28, 28], f16), T([480], f16), T([480], f16), T([480], f16), T([480], f32), T([480], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 448, 28, 28], f16), T([64, 448, 28, 28], f16), T([448], f16), T([448], f16), T([448], f16), T([448], f32), T([448], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 416, 28, 28], f16), T([64, 416, 28, 28], f16), T([416], f16), T([416], f16), T([416], f16), T([416], f32), T([416], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 384, 28, 28], f16), T([64, 384, 28, 28], f16), T([384], f16), T([384], f16), T([384], f16), T([384], f32), T([384], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 352, 28, 28], f16), T([64, 352, 28, 28], f16), T([352], f16), T([352], f16), T([352], f16), T([352], f32), T([352], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 320, 28, 28], f16), T([64, 320, 28, 28], f16), T([320], f16), T([320], f16), T([320], f16), T([320], f32), T([320], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 288, 28, 28], f16), T([64, 288, 28, 28], f16), T([288], f16), T([288], f16), T([288], f16), T([288], f32), T([288], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 256, 28, 28], f16), T([64, 256, 28, 28], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 224, 28, 28], f16), T([64, 224, 28, 28], f16), T([224], f16), T([224], f16), T([224], f16), T([224], f32), T([224], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 192, 28, 28], f16), T([64, 192, 28, 28], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f32), T([192], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 160, 28, 28], f16), T([64, 160, 28, 28], f16), T([160], f16), T([160], f16), T([160], f16), T([160], f32), T([160], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 256, 56, 56], f16), T([64, 256, 56, 56], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 7, ((T([64, 128, 56, 56], f16), T([64, 128, 56, 56], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 224, 56, 56], f16), T([64, 224, 56, 56], f16), T([224], f16), T([224], f16), T([224], f16), T([224], f32), T([224], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 192, 56, 56], f16), T([64, 192, 56, 56], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f32), T([192], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 160, 56, 56], f16), T([64, 160, 56, 56], f16), T([160], f16), T([160], f16), T([160], f16), T([160], f32), T([160], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 96, 56, 56], f16), T([64, 96, 56, 56], f16), T([96], f16), T([96], f16), T([96], f16), T([96], f32), T([96], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 64, 56, 56], f16), T([64, 64, 56, 56], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 64, 112, 112], f16), T([64, 64, 112, 112], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([64, 1000], f16), T([64], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([64, 1000], f16), T([64], i64), None, 1, -100), {})
Operator: aten.relu_.default
cnt: 1, ((T([64, 64, 112, 112], f16),), {})
cnt: 1, ((T([64, 64, 56, 56], f16),), {})
cnt: 7, ((T([64, 128, 56, 56], f16),), {})
cnt: 1, ((T([64, 96, 56, 56], f16),), {})
cnt: 1, ((T([64, 160, 56, 56], f16),), {})
cnt: 1, ((T([64, 192, 56, 56], f16),), {})
cnt: 1, ((T([64, 224, 56, 56], f16),), {})
cnt: 1, ((T([64, 256, 56, 56], f16),), {})
cnt: 13, ((T([64, 128, 28, 28], f16),), {})
cnt: 1, ((T([64, 160, 28, 28], f16),), {})
cnt: 1, ((T([64, 192, 28, 28], f16),), {})
cnt: 1, ((T([64, 224, 28, 28], f16),), {})
cnt: 1, ((T([64, 256, 28, 28], f16),), {})
cnt: 1, ((T([64, 288, 28, 28], f16),), {})
cnt: 1, ((T([64, 320, 28, 28], f16),), {})
cnt: 1, ((T([64, 352, 28, 28], f16),), {})
cnt: 1, ((T([64, 384, 28, 28], f16),), {})
cnt: 1, ((T([64, 416, 28, 28], f16),), {})
cnt: 1, ((T([64, 448, 28, 28], f16),), {})
cnt: 1, ((T([64, 480, 28, 28], f16),), {})
cnt: 1, ((T([64, 512, 28, 28], f16),), {})
cnt: 1, ((T([64, 256, 14, 14], f16),), {})
cnt: 24, ((T([64, 128, 14, 14], f16),), {})
cnt: 1, ((T([64, 288, 14, 14], f16),), {})
cnt: 1, ((T([64, 320, 14, 14], f16),), {})
cnt: 1, ((T([64, 352, 14, 14], f16),), {})
cnt: 1, ((T([64, 384, 14, 14], f16),), {})
cnt: 1, ((T([64, 416, 14, 14], f16),), {})
cnt: 1, ((T([64, 448, 14, 14], f16),), {})
cnt: 1, ((T([64, 480, 14, 14], f16),), {})
cnt: 1, ((T([64, 512, 14, 14], f16),), {})
cnt: 1, ((T([64, 544, 14, 14], f16),), {})
cnt: 1, ((T([64, 576, 14, 14], f16),), {})
cnt: 1, ((T([64, 608, 14, 14], f16),), {})
cnt: 1, ((T([64, 640, 14, 14], f16),), {})
cnt: 1, ((T([64, 672, 14, 14], f16),), {})
cnt: 1, ((T([64, 704, 14, 14], f16),), {})
cnt: 1, ((T([64, 736, 14, 14], f16),), {})
cnt: 1, ((T([64, 768, 14, 14], f16),), {})
cnt: 1, ((T([64, 800, 14, 14], f16),), {})
cnt: 1, ((T([64, 832, 14, 14], f16),), {})
cnt: 1, ((T([64, 864, 14, 14], f16),), {})
cnt: 1, ((T([64, 896, 14, 14], f16),), {})
cnt: 1, ((T([64, 928, 14, 14], f16),), {})
cnt: 1, ((T([64, 960, 14, 14], f16),), {})
cnt: 1, ((T([64, 992, 14, 14], f16),), {})
cnt: 1, ((T([64, 1024, 14, 14], f16),), {})
cnt: 1, ((T([64, 512, 7, 7], f16),), {})
cnt: 16, ((T([64, 128, 7, 7], f16),), {})
cnt: 1, ((T([64, 544, 7, 7], f16),), {})
cnt: 1, ((T([64, 576, 7, 7], f16),), {})
cnt: 1, ((T([64, 608, 7, 7], f16),), {})
cnt: 1, ((T([64, 640, 7, 7], f16),), {})
cnt: 1, ((T([64, 672, 7, 7], f16),), {})
cnt: 1, ((T([64, 704, 7, 7], f16),), {})
cnt: 1, ((T([64, 736, 7, 7], f16),), {})
cnt: 1, ((T([64, 768, 7, 7], f16),), {})
cnt: 1, ((T([64, 800, 7, 7], f16),), {})
cnt: 1, ((T([64, 832, 7, 7], f16),), {})
cnt: 1, ((T([64, 864, 7, 7], f16),), {})
cnt: 1, ((T([64, 896, 7, 7], f16),), {})
cnt: 1, ((T([64, 928, 7, 7], f16),), {})
cnt: 1, ((T([64, 960, 7, 7], f16),), {})
cnt: 1, ((T([64, 992, 7, 7], f16),), {})
cnt: 1, ((T([64, 1024, 7, 7], f16),), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([64, 1000], f16), [0], True), {})
Operator: aten.threshold_backward.default
cnt: 1, ((T([64, 1024, 7, 7], f16), T([64, 1024, 7, 7], f16), 0), {})
cnt: 16, ((T([64, 128, 7, 7], f16), T([64, 128, 7, 7], f16), 0), {})
cnt: 1, ((T([64, 992, 7, 7], f16), T([64, 992, 7, 7], f16), 0), {})
cnt: 1, ((T([64, 960, 7, 7], f16), T([64, 960, 7, 7], f16), 0), {})
cnt: 1, ((T([64, 928, 7, 7], f16), T([64, 928, 7, 7], f16), 0), {})
cnt: 1, ((T([64, 896, 7, 7], f16), T([64, 896, 7, 7], f16), 0), {})
cnt: 1, ((T([64, 864, 7, 7], f16), T([64, 864, 7, 7], f16), 0), {})
cnt: 1, ((T([64, 832, 7, 7], f16), T([64, 832, 7, 7], f16), 0), {})
cnt: 1, ((T([64, 800, 7, 7], f16), T([64, 800, 7, 7], f16), 0), {})
cnt: 1, ((T([64, 768, 7, 7], f16), T([64, 768, 7, 7], f16), 0), {})
cnt: 1, ((T([64, 736, 7, 7], f16), T([64, 736, 7, 7], f16), 0), {})
cnt: 1, ((T([64, 704, 7, 7], f16), T([64, 704, 7, 7], f16), 0), {})
cnt: 1, ((T([64, 672, 7, 7], f16), T([64, 672, 7, 7], f16), 0), {})
cnt: 1, ((T([64, 640, 7, 7], f16), T([64, 640, 7, 7], f16), 0), {})
cnt: 1, ((T([64, 608, 7, 7], f16), T([64, 608, 7, 7], f16), 0), {})
cnt: 1, ((T([64, 576, 7, 7], f16), T([64, 576, 7, 7], f16), 0), {})
cnt: 1, ((T([64, 544, 7, 7], f16), T([64, 544, 7, 7], f16), 0), {})
cnt: 1, ((T([64, 512, 7, 7], f16), T([64, 512, 7, 7], f16), 0), {})
cnt: 1, ((T([64, 1024, 14, 14], f16), T([64, 1024, 14, 14], f16), 0), {})
cnt: 24, ((T([64, 128, 14, 14], f16), T([64, 128, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 992, 14, 14], f16), T([64, 992, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 960, 14, 14], f16), T([64, 960, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 928, 14, 14], f16), T([64, 928, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 896, 14, 14], f16), T([64, 896, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 864, 14, 14], f16), T([64, 864, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 832, 14, 14], f16), T([64, 832, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 800, 14, 14], f16), T([64, 800, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 768, 14, 14], f16), T([64, 768, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 736, 14, 14], f16), T([64, 736, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 704, 14, 14], f16), T([64, 704, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 672, 14, 14], f16), T([64, 672, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 640, 14, 14], f16), T([64, 640, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 608, 14, 14], f16), T([64, 608, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 576, 14, 14], f16), T([64, 576, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 544, 14, 14], f16), T([64, 544, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 512, 14, 14], f16), T([64, 512, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 480, 14, 14], f16), T([64, 480, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 448, 14, 14], f16), T([64, 448, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 416, 14, 14], f16), T([64, 416, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 384, 14, 14], f16), T([64, 384, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 352, 14, 14], f16), T([64, 352, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 320, 14, 14], f16), T([64, 320, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 288, 14, 14], f16), T([64, 288, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 512, 28, 28], f16), T([64, 512, 28, 28], f16), 0), {})
cnt: 13, ((T([64, 128, 28, 28], f16), T([64, 128, 28, 28], f16), 0), {})
cnt: 1, ((T([64, 480, 28, 28], f16), T([64, 480, 28, 28], f16), 0), {})
cnt: 1, ((T([64, 448, 28, 28], f16), T([64, 448, 28, 28], f16), 0), {})
cnt: 1, ((T([64, 416, 28, 28], f16), T([64, 416, 28, 28], f16), 0), {})
cnt: 1, ((T([64, 384, 28, 28], f16), T([64, 384, 28, 28], f16), 0), {})
cnt: 1, ((T([64, 352, 28, 28], f16), T([64, 352, 28, 28], f16), 0), {})
cnt: 1, ((T([64, 320, 28, 28], f16), T([64, 320, 28, 28], f16), 0), {})
cnt: 1, ((T([64, 288, 28, 28], f16), T([64, 288, 28, 28], f16), 0), {})
cnt: 1, ((T([64, 256, 28, 28], f16), T([64, 256, 28, 28], f16), 0), {})
cnt: 1, ((T([64, 224, 28, 28], f16), T([64, 224, 28, 28], f16), 0), {})
cnt: 1, ((T([64, 192, 28, 28], f16), T([64, 192, 28, 28], f16), 0), {})
cnt: 1, ((T([64, 160, 28, 28], f16), T([64, 160, 28, 28], f16), 0), {})
cnt: 1, ((T([64, 256, 56, 56], f16), T([64, 256, 56, 56], f16), 0), {})
cnt: 7, ((T([64, 128, 56, 56], f16), T([64, 128, 56, 56], f16), 0), {})
cnt: 1, ((T([64, 224, 56, 56], f16), T([64, 224, 56, 56], f16), 0), {})
cnt: 1, ((T([64, 192, 56, 56], f16), T([64, 192, 56, 56], f16), 0), {})
cnt: 1, ((T([64, 160, 56, 56], f16), T([64, 160, 56, 56], f16), 0), {})
cnt: 1, ((T([64, 96, 56, 56], f16), T([64, 96, 56, 56], f16), 0), {})
cnt: 1, ((T([64, 64, 56, 56], f16), T([64, 64, 56, 56], f16), 0), {})
cnt: 1, ((T([64, 64, 112, 112], f16), T([64, 64, 112, 112], f16), 0), {})

View File

@ -0,0 +1,189 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([64, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([64, 1000], f16), T([64, 1000], f16), 1, f16), {})
Operator: aten.add.Tensor
cnt: 1, ((T([64, 1024, 7, 7], f16), T([64, 1024, 7, 7], f16, stride=(125440, 49, 7, 1))), {})
cnt: 1, ((T([64, 1024, 7, 7], f16, stride=(125440, 49, 7, 1)), T([64, 1024, 7, 7], f16)), {})
cnt: 1, ((T([64, 1024, 7, 7], f16), T([64, 1024, 7, 7], f16)), {})
cnt: 1, ((T([64, 512, 7, 7], f16, stride=(125440, 49, 7, 1)), T([64, 512, 7, 7], f16)), {})
cnt: 16, ((T([64, 512, 14, 14], f16), T([64, 512, 14, 14], f16)), {})
cnt: 1, ((T([64, 512, 14, 14], f16), T([64, 512, 14, 14], f16, stride=(551936, 196, 14, 1))), {})
cnt: 4, ((T([64, 512, 14, 14], f16, stride=(551936, 196, 14, 1)), T([64, 512, 14, 14], f16)), {})
cnt: 4, ((T([64, 512, 14, 14], f16), T([64, 512, 14, 14], f16, stride=(200704, 196, 14, 1))), {})
cnt: 4, ((T([64, 512, 14, 14], f16, stride=(200704, 196, 14, 1)), T([64, 512, 14, 14], f16)), {})
cnt: 2, ((T([64, 512, 14, 14], f16), T([64, 512, 14, 14], f16, stride=(301056, 196, 14, 1))), {})
cnt: 4, ((T([64, 512, 14, 14], f16, stride=(301056, 196, 14, 1)), T([64, 512, 14, 14], f16)), {})
cnt: 1, ((T([64, 512, 14, 14], f16), T([64, 512, 14, 14], f16, stride=(401408, 196, 14, 1))), {})
cnt: 3, ((T([64, 512, 14, 14], f16, stride=(401408, 196, 14, 1)), T([64, 512, 14, 14], f16)), {})
cnt: 9, ((T([64, 256, 28, 28], f16), T([64, 256, 28, 28], f16)), {})
cnt: 1, ((T([64, 256, 28, 28], f16), T([64, 256, 28, 28], f16, stride=(903168, 784, 28, 1))), {})
cnt: 3, ((T([64, 256, 28, 28], f16, stride=(903168, 784, 28, 1)), T([64, 256, 28, 28], f16)), {})
cnt: 2, ((T([64, 256, 28, 28], f16), T([64, 256, 28, 28], f16, stride=(401408, 784, 28, 1))), {})
cnt: 2, ((T([64, 256, 28, 28], f16, stride=(401408, 784, 28, 1)), T([64, 256, 28, 28], f16)), {})
cnt: 1, ((T([64, 256, 28, 28], f16), T([64, 256, 28, 28], f16, stride=(602112, 784, 28, 1))), {})
cnt: 2, ((T([64, 256, 28, 28], f16, stride=(602112, 784, 28, 1)), T([64, 256, 28, 28], f16)), {})
cnt: 3, ((T([64, 128, 56, 56], f16), T([64, 128, 56, 56], f16)), {})
cnt: 1, ((T([64, 128, 56, 56], f16), T([64, 128, 56, 56], f16, stride=(802816, 3136, 56, 1))), {})
cnt: 1, ((T([64, 128, 56, 56], f16, stride=(802816, 3136, 56, 1)), T([64, 128, 56, 56], f16)), {})
cnt: 1, ((T([64, 32, 112, 112], f16), T([64, 32, 112, 112], f16)), {})
Operator: aten.add_.Tensor
cnt: 105, ((T([], i64), 1), {})
cnt: 3, ((T([64, 128, 56, 56], f16), T([64, 128, 56, 56], f16)), {})
cnt: 12, ((T([64, 256, 28, 28], f16), T([64, 256, 28, 28], f16)), {})
cnt: 24, ((T([64, 512, 14, 14], f16), T([64, 512, 14, 14], f16)), {})
cnt: 3, ((T([64, 1024, 7, 7], f16), T([64, 1024, 7, 7], f16)), {})
Operator: aten.cat.default
cnt: 1, (([T([64, 128, 56, 56], f16), T([64, 128, 56, 56], f16)], 1), {})
cnt: 2, (([T([64, 256, 28, 28], f16), T([64, 256, 28, 28], f16)], 1), {})
cnt: 1, (([T([64, 256, 28, 28], f16), T([64, 256, 28, 28], f16), T([64, 256, 28, 28], f16)], 1), {})
cnt: 1, (([T([64, 256, 28, 28], f16), T([64, 256, 28, 28], f16), T([64, 128, 28, 28], f16), T([64, 256, 28, 28], f16), T([64, 256, 28, 28], f16)], 1), {})
cnt: 4, (([T([64, 512, 14, 14], f16), T([64, 512, 14, 14], f16)], 1), {})
cnt: 2, (([T([64, 512, 14, 14], f16), T([64, 512, 14, 14], f16), T([64, 512, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 512, 14, 14], f16), T([64, 512, 14, 14], f16), T([64, 512, 14, 14], f16), T([64, 512, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 512, 14, 14], f16), T([64, 512, 14, 14], f16), T([64, 256, 14, 14], f16), T([64, 512, 14, 14], f16), T([64, 512, 14, 14], f16), T([64, 512, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 1024, 7, 7], f16), T([64, 1024, 7, 7], f16), T([64, 512, 7, 7], f16)], 1), {})
Operator: aten.clone.default
cnt: 1, ((T([64, 3, 224, 224], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([64, 3, 224, 224], f16), T([16, 3, 7, 7], f16), None, [1, 1], [3, 3], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 16, 224, 224], f16), T([16, 16, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 16, 224, 224], f16), T([32, 16, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 32, 56, 56], f16), T([128, 32, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 32, 112, 112], f16), T([64, 32, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 64, 112, 112], f16), T([64, 64, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([64, 64, 56, 56], f16), T([128, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 128, 56, 56], f16), T([64, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 64, 56, 56], f16), T([64, 64, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 256, 56, 56], f16), T([128, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 9, ((T([64, 128, 28, 28], f16), T([256, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 128, 56, 56], f16), T([128, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 128, 56, 56], f16), T([128, 128, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 7, ((T([64, 256, 28, 28], f16), T([128, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 7, ((T([64, 128, 28, 28], f16), T([128, 128, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([64, 512, 28, 28], f16), T([256, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 768, 28, 28], f16), T([256, 768, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 1152, 28, 28], f16), T([256, 1152, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 17, ((T([64, 256, 14, 14], f16), T([512, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 256, 28, 28], f16), T([256, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 256, 28, 28], f16), T([256, 256, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 15, ((T([64, 512, 14, 14], f16), T([256, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 15, ((T([64, 256, 14, 14], f16), T([256, 256, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([64, 1024, 14, 14], f16), T([512, 1024, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([64, 1536, 14, 14], f16), T([512, 1536, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 2048, 14, 14], f16), T([512, 2048, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 2816, 14, 14], f16), T([512, 2816, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([64, 512, 7, 7], f16), T([1024, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 512, 14, 14], f16), T([512, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 512, 14, 14], f16), T([512, 512, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 1024, 7, 7], f16), T([512, 1024, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 512, 7, 7], f16), T([512, 512, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 2560, 7, 7], f16), T([1024, 2560, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 1024, 1, 1], f16), T([1000, 1024, 1, 1], f16), T([1000], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([64, 1000, 1, 1], f16), T([64, 1024, 1, 1], f16), T([1000, 1024, 1, 1], f16), [1000], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([64, 1024, 7, 7], f16), T([64, 2560, 7, 7], f16), T([1024, 2560, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([64, 1024, 7, 7], f16), T([64, 512, 7, 7], f16), T([1024, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 512, 7, 7], f16), T([64, 512, 7, 7], f16), T([512, 512, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 512, 7, 7], f16), T([64, 1024, 7, 7], f16), T([512, 1024, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 512, 7, 7], f16), T([64, 512, 14, 14], f16), T([512, 512, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 512, 14, 14], f16), T([64, 512, 14, 14], f16), T([512, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 512, 14, 14], f16), T([64, 2816, 14, 14], f16), T([512, 2816, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 17, ((T([64, 512, 14, 14], f16), T([64, 256, 14, 14], f16), T([512, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 15, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16), T([256, 256, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 15, ((T([64, 256, 14, 14], f16), T([64, 512, 14, 14], f16), T([256, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([64, 512, 14, 14], f16), T([64, 1024, 14, 14], f16), T([512, 1024, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([64, 512, 14, 14], f16), T([64, 1536, 14, 14], f16), T([512, 1536, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 512, 14, 14], f16), T([64, 2048, 14, 14], f16), T([512, 2048, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 28, 28], f16), T([256, 256, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 256, 28, 28], f16), T([64, 256, 28, 28], f16), T([256, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 256, 28, 28], f16), T([64, 1152, 28, 28], f16), T([256, 1152, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 9, ((T([64, 256, 28, 28], f16), T([64, 128, 28, 28], f16), T([256, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 7, ((T([64, 128, 28, 28], f16), T([64, 128, 28, 28], f16), T([128, 128, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 7, ((T([64, 128, 28, 28], f16), T([64, 256, 28, 28], f16), T([128, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([64, 256, 28, 28], f16), T([64, 512, 28, 28], f16), T([256, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 256, 28, 28], f16), T([64, 768, 28, 28], f16), T([256, 768, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 128, 56, 56], f16), T([128, 128, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 56, 56], f16), T([64, 128, 56, 56], f16), T([128, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 56, 56], f16), T([64, 256, 56, 56], f16), T([128, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([64, 128, 56, 56], f16), T([64, 64, 56, 56], f16), T([128, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 64, 56, 56], f16), T([64, 64, 56, 56], f16), T([64, 64, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 64, 56, 56], f16), T([64, 128, 56, 56], f16), T([64, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 64, 56, 56], f16), T([64, 64, 112, 112], f16), T([64, 64, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 64, 112, 112], f16), T([64, 32, 112, 112], f16), T([64, 32, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 56, 56], f16), T([64, 32, 56, 56], f16), T([128, 32, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 32, 112, 112], f16), T([64, 16, 224, 224], f16), T([32, 16, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 16, 224, 224], f16), T([64, 16, 224, 224], f16), T([16, 16, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 16, 224, 224], f16), T([64, 3, 224, 224], f16), T([16, 3, 7, 7], f16), [0], [1, 1], [3, 3], [1, 1], False, [0, 0], 1, [False, True, False]), {})
Operator: aten.copy_.default
cnt: 1, ((T([64, 3, 224, 224], f16), T([64, 3, 224, 224], f16)), {})
Operator: aten.div.Scalar
cnt: 1, ((T([64, 1024, 7, 7], f16, stride=(1024, 1, 0, 0)), 49), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([64], i64),), {})
Operator: aten.max_pool2d_with_indices.default
cnt: 1, ((T([64, 32, 112, 112], f16), [2, 2], [2, 2]), {})
cnt: 3, ((T([64, 128, 56, 56], f16), [2, 2], [2, 2]), {})
cnt: 4, ((T([64, 256, 28, 28], f16), [2, 2], [2, 2]), {})
cnt: 1, ((T([64, 512, 14, 14], f16), [2, 2], [2, 2]), {})
Operator: aten.max_pool2d_with_indices_backward.default
cnt: 1, ((T([64, 512, 7, 7], f16), T([64, 512, 14, 14], f16), [2, 2], [2, 2], [0, 0], [1, 1], False, T([64, 512, 7, 7], i64)), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 28, 28], f16), [2, 2], [2, 2], [0, 0], [1, 1], False, T([64, 256, 14, 14], i64)), {})
cnt: 1, ((T([64, 256, 14, 14], f16, stride=(551936, 196, 14, 1)), T([64, 256, 28, 28], f16), [2, 2], [2, 2], [0, 0], [1, 1], False, T([64, 256, 14, 14], i64)), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 128, 56, 56], f16), [2, 2], [2, 2], [0, 0], [1, 1], False, T([64, 128, 28, 28], i64)), {})
cnt: 1, ((T([64, 128, 28, 28], f16, stride=(903168, 784, 28, 1)), T([64, 128, 56, 56], f16), [2, 2], [2, 2], [0, 0], [1, 1], False, T([64, 128, 28, 28], i64)), {})
cnt: 1, ((T([64, 32, 56, 56], f16), T([64, 32, 112, 112], f16), [2, 2], [2, 2], [0, 0], [1, 1], False, T([64, 32, 56, 56], i64)), {})
Operator: aten.mean.dim
cnt: 1, ((T([64, 1024, 7, 7], f16), [-1, -2], True), {})
Operator: aten.native_batch_norm.default
cnt: 2, ((T([64, 16, 224, 224], f16), T([16], f16), T([16], f16), T([16], f16), T([16], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 32, 112, 112], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 1e-05), {})
cnt: 5, ((T([64, 128, 56, 56], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 64, 112, 112], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([64, 64, 56, 56], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 14, ((T([64, 256, 28, 28], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 15, ((T([64, 128, 28, 28], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 26, ((T([64, 512, 14, 14], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
cnt: 31, ((T([64, 256, 14, 14], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([64, 1024, 7, 7], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([64, 512, 7, 7], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
Operator: aten.native_batch_norm_backward.default
cnt: 4, ((T([64, 1024, 7, 7], f16), T([64, 1024, 7, 7], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f32), T([1024], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([64, 512, 7, 7], f16), T([64, 512, 7, 7], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 26, ((T([64, 512, 14, 14], f16), T([64, 512, 14, 14], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 31, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 14, ((T([64, 256, 28, 28], f16), T([64, 256, 28, 28], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 15, ((T([64, 128, 28, 28], f16), T([64, 128, 28, 28], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 5, ((T([64, 128, 56, 56], f16), T([64, 128, 56, 56], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([64, 64, 56, 56], f16), T([64, 64, 56, 56], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 64, 112, 112], f16), T([64, 64, 112, 112], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 32, 112, 112], f16), T([64, 32, 112, 112], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([64, 16, 224, 224], f16), T([64, 16, 224, 224], f16), T([16], f16), T([16], f16), T([16], f16), T([16], f32), T([16], f32), True, 1e-05, [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([64, 1000], f16), T([64], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([64, 1000], f16), T([64], i64), None, 1, -100), {})
Operator: aten.relu_.default
cnt: 2, ((T([64, 16, 224, 224], f16),), {})
cnt: 1, ((T([64, 32, 112, 112], f16),), {})
cnt: 1, ((T([64, 64, 112, 112], f16),), {})
cnt: 3, ((T([64, 64, 56, 56], f16),), {})
cnt: 4, ((T([64, 128, 56, 56], f16),), {})
cnt: 15, ((T([64, 128, 28, 28], f16),), {})
cnt: 13, ((T([64, 256, 28, 28], f16),), {})
cnt: 31, ((T([64, 256, 14, 14], f16),), {})
cnt: 25, ((T([64, 512, 14, 14], f16),), {})
cnt: 3, ((T([64, 512, 7, 7], f16),), {})
cnt: 3, ((T([64, 1024, 7, 7], f16),), {})
Operator: aten.threshold_backward.default
cnt: 3, ((T([64, 1024, 7, 7], f16), T([64, 1024, 7, 7], f16), 0), {})
cnt: 3, ((T([64, 512, 7, 7], f16), T([64, 512, 7, 7], f16), 0), {})
cnt: 25, ((T([64, 512, 14, 14], f16), T([64, 512, 14, 14], f16), 0), {})
cnt: 31, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16), 0), {})
cnt: 13, ((T([64, 256, 28, 28], f16), T([64, 256, 28, 28], f16), 0), {})
cnt: 15, ((T([64, 128, 28, 28], f16), T([64, 128, 28, 28], f16), 0), {})
cnt: 4, ((T([64, 128, 56, 56], f16), T([64, 128, 56, 56], f16), 0), {})
cnt: 3, ((T([64, 64, 56, 56], f16), T([64, 64, 56, 56], f16), 0), {})
cnt: 1, ((T([64, 64, 112, 112], f16), T([64, 64, 112, 112], f16), 0), {})
cnt: 1, ((T([64, 32, 112, 112], f16), T([64, 32, 112, 112], f16), 0), {})
cnt: 2, ((T([64, 16, 224, 224], f16), T([64, 16, 224, 224], f16), 0), {})

View File

@ -0,0 +1,296 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([128, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([128, 1000], f16), T([128, 1000], f16), 1, f16), {})
Operator: aten.add.Tensor
cnt: 3, ((T([128, 256, 48, 48], f16), T([128, 256, 48, 48], f16)), {})
cnt: 6, ((T([128, 512, 24, 24], f16), T([128, 512, 24, 24], f16)), {})
cnt: 18, ((T([128, 1536, 12, 12], f16), T([128, 1536, 12, 12], f16)), {})
cnt: 8, ((T([128, 1536, 6, 6], f16), T([128, 1536, 6, 6], f16)), {})
cnt: 1, ((T([128, 128, 48, 48], f16), T([128, 128, 48, 48], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([128, 3072], f16), T([3072, 1000], f16, stride=(1, 3072))), {})
Operator: aten.avg_pool2d.default
cnt: 1, ((T([128, 256, 48, 48], f16), [2, 2], [2, 2], [0, 0], True, False), {})
cnt: 1, ((T([128, 512, 24, 24], f16), [2, 2], [2, 2], [0, 0], True, False), {})
cnt: 1, ((T([128, 1536, 12, 12], f16), [2, 2], [2, 2], [0, 0], True, False), {})
Operator: aten.avg_pool2d_backward.default
cnt: 1, ((T([128, 1536, 6, 6], f16), T([128, 1536, 12, 12], f16), [2, 2], [2, 2], [0, 0], True, False, None), {})
cnt: 1, ((T([128, 512, 12, 12], f16), T([128, 512, 24, 24], f16), [2, 2], [2, 2], [0, 0], True, False, None), {})
cnt: 1, ((T([128, 256, 24, 24], f16), T([128, 256, 48, 48], f16), [2, 2], [2, 2], [0, 0], True, False, None), {})
Operator: aten.clone.default
cnt: 1, ((T([128, 3, 192, 192], f16),), {})
cnt: 1, ((T([128, 256, 48, 48], f16),), {})
cnt: 2, ((T([128, 512, 24, 24], f16),), {})
cnt: 6, ((T([128, 1536, 12, 12], f16),), {})
cnt: 3, ((T([128, 1536, 6, 6], f16),), {})
Operator: aten.constant_pad_nd.default
cnt: 1, ((T([128, 3, 192, 192], f16), [0, 1, 0, 1], 0.0), {})
cnt: 1, ((T([128, 64, 96, 96], f16), [0, 1, 0, 1], 0.0), {})
cnt: 1, ((T([128, 256, 48, 48], f16), [0, 1, 0, 1], 0.0), {})
cnt: 1, ((T([128, 768, 24, 24], f16), [0, 1, 0, 1], 0.0), {})
cnt: 1, ((T([128, 768, 12, 12], f16), [0, 1, 0, 1], 0.0), {})
cnt: 1, ((T([128, 768, 13, 13], f16), [0, -1, 0, -1]), {})
cnt: 1, ((T([128, 768, 25, 25], f16), [0, -1, 0, -1]), {})
cnt: 1, ((T([128, 256, 49, 49], f16), [0, -1, 0, -1]), {})
cnt: 1, ((T([128, 64, 97, 97], f16), [0, -1, 0, -1]), {})
Operator: aten.convolution.default
cnt: 1, ((T([128, 3, 193, 193], f16), T([16, 3, 3, 3], f16), T([16], f16), [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 16, 96, 96], f16), T([32, 16, 3, 3], f16), T([32], f16), [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 32, 96, 96], f16), T([64, 32, 3, 3], f16), T([64], f16), [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 64, 97, 97], f16), T([128, 64, 3, 3], f16), T([128], f16), [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 128, 48, 48], f16), T([256, 128, 1, 1], f16), T([256], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 128, 48, 48], f16), T([128, 128, 1, 1], f16), T([128], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 128, 48, 48], f16), T([128, 128, 3, 3], f16), T([128], f16), [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 1, 1], f16), T([128, 256, 1, 1], f16), T([128], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 128, 1, 1], f16), T([256, 128, 1, 1], f16), T([256], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 256, 24, 24], f16), T([512, 256, 1, 1], f16), T([512], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 48, 48], f16), T([256, 256, 1, 1], f16), T([256], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 49, 49], f16), T([256, 128, 3, 3], f16), T([256], f16), [2, 2], [0, 0], [1, 1], False, [0, 0], 2), {})
cnt: 3, ((T([128, 256, 24, 24], f16), T([256, 128, 3, 3], f16), T([256], f16), [1, 1], [1, 1], [1, 1], False, [0, 0], 2), {})
cnt: 2, ((T([128, 512, 1, 1], f16), T([256, 512, 1, 1], f16), T([256], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 256, 1, 1], f16), T([512, 256, 1, 1], f16), T([512], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 512, 24, 24], f16), T([256, 512, 1, 1], f16), T([256], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 512, 12, 12], f16), T([1536, 512, 1, 1], f16), T([1536], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 512, 24, 24], f16), T([768, 512, 1, 1], f16), T([768], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 768, 25, 25], f16), T([768, 128, 3, 3], f16), T([768], f16), [2, 2], [0, 0], [1, 1], False, [0, 0], 6), {})
cnt: 11, ((T([128, 768, 12, 12], f16), T([768, 128, 3, 3], f16), T([768], f16), [1, 1], [1, 1], [1, 1], False, [0, 0], 6), {})
cnt: 6, ((T([128, 768, 12, 12], f16), T([1536, 768, 1, 1], f16), T([1536], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 9, ((T([128, 1536, 1, 1], f16), T([768, 1536, 1, 1], f16), T([768], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 9, ((T([128, 768, 1, 1], f16), T([1536, 768, 1, 1], f16), T([1536], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 6, ((T([128, 1536, 12, 12], f16), T([768, 1536, 1, 1], f16), T([768], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1536, 6, 6], f16), T([1536, 1536, 1, 1], f16), T([1536], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 768, 13, 13], f16), T([768, 128, 3, 3], f16), T([768], f16), [2, 2], [0, 0], [1, 1], False, [0, 0], 6), {})
cnt: 5, ((T([128, 768, 6, 6], f16), T([768, 128, 3, 3], f16), T([768], f16), [1, 1], [1, 1], [1, 1], False, [0, 0], 6), {})
cnt: 3, ((T([128, 768, 6, 6], f16), T([1536, 768, 1, 1], f16), T([1536], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 1536, 6, 6], f16), T([768, 1536, 1, 1], f16), T([768], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1536, 6, 6], f16), T([3072, 1536, 1, 1], f16), T([3072], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([128, 3072, 6, 6], f16), T([128, 1536, 6, 6], f16), T([3072, 1536, 1, 1], f16), [3072], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 9, ((T([128, 1536, 1, 1], f16), T([128, 768, 1, 1], f16), T([1536, 768, 1, 1], f16), [1536], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 9, ((T([128, 768, 1, 1], f16), T([128, 1536, 1, 1], f16), T([768, 1536, 1, 1], f16), [768], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 3, ((T([128, 1536, 6, 6], f16), T([128, 768, 6, 6], f16), T([1536, 768, 1, 1], f16), [1536], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 5, ((T([128, 768, 6, 6], f16), T([128, 768, 6, 6], f16), T([768, 128, 3, 3], f16), [768], [1, 1], [1, 1], [1, 1], False, [0, 0], 6, [True, True, True]), {})
cnt: 2, ((T([128, 768, 6, 6], f16), T([128, 1536, 6, 6], f16), T([768, 1536, 1, 1], f16), [768], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 768, 6, 6], f16), T([128, 768, 13, 13], f16), T([768, 128, 3, 3], f16), [768], [2, 2], [0, 0], [1, 1], False, [0, 0], 6, [True, True, True]), {})
cnt: 6, ((T([128, 768, 12, 12], f16), T([128, 1536, 12, 12], f16), T([768, 1536, 1, 1], f16), [768], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 1536, 6, 6], f16), T([128, 1536, 6, 6], f16), T([1536, 1536, 1, 1], f16), [1536], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 6, ((T([128, 1536, 12, 12], f16), T([128, 768, 12, 12], f16), T([1536, 768, 1, 1], f16), [1536], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 11, ((T([128, 768, 12, 12], f16), T([128, 768, 12, 12], f16), T([768, 128, 3, 3], f16), [768], [1, 1], [1, 1], [1, 1], False, [0, 0], 6, [True, True, True]), {})
cnt: 1, ((T([128, 768, 12, 12], f16), T([128, 768, 25, 25], f16), T([768, 128, 3, 3], f16), [768], [2, 2], [0, 0], [1, 1], False, [0, 0], 6, [True, True, True]), {})
cnt: 1, ((T([128, 768, 24, 24], f16), T([128, 512, 24, 24], f16), T([768, 512, 1, 1], f16), [768], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 1536, 12, 12], f16), T([128, 512, 12, 12], f16), T([1536, 512, 1, 1], f16), [1536], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 2, ((T([128, 512, 1, 1], f16), T([128, 256, 1, 1], f16), T([512, 256, 1, 1], f16), [512], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 2, ((T([128, 256, 1, 1], f16), T([128, 512, 1, 1], f16), T([256, 512, 1, 1], f16), [256], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 3, ((T([128, 512, 24, 24], f16), T([128, 256, 24, 24], f16), T([512, 256, 1, 1], f16), [512], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 3, ((T([128, 256, 24, 24], f16), T([128, 256, 24, 24], f16), T([256, 128, 3, 3], f16), [256], [1, 1], [1, 1], [1, 1], False, [0, 0], 2, [True, True, True]), {})
cnt: 1, ((T([128, 256, 24, 24], f16), T([128, 512, 24, 24], f16), T([256, 512, 1, 1], f16), [256], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 256, 24, 24], f16), T([128, 256, 49, 49], f16), T([256, 128, 3, 3], f16), [256], [2, 2], [0, 0], [1, 1], False, [0, 0], 2, [True, True, True]), {})
cnt: 1, ((T([128, 256, 48, 48], f16), T([128, 256, 48, 48], f16), T([256, 256, 1, 1], f16), [256], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 256, 1, 1], f16), T([128, 128, 1, 1], f16), T([256, 128, 1, 1], f16), [256], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 128, 1, 1], f16), T([128, 256, 1, 1], f16), T([128, 256, 1, 1], f16), [128], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 2, ((T([128, 256, 48, 48], f16), T([128, 128, 48, 48], f16), T([256, 128, 1, 1], f16), [256], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 2, ((T([128, 128, 48, 48], f16), T([128, 128, 48, 48], f16), T([128, 128, 3, 3], f16), [128], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 128, 48, 48], f16), T([128, 128, 48, 48], f16), T([128, 128, 1, 1], f16), [128], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 128, 48, 48], f16), T([128, 64, 97, 97], f16), T([128, 64, 3, 3], f16), [128], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 64, 96, 96], f16), T([128, 32, 96, 96], f16), T([64, 32, 3, 3], f16), [64], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 32, 96, 96], f16), T([128, 16, 96, 96], f16), T([32, 16, 3, 3], f16), [32], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 16, 96, 96], f16), T([128, 3, 193, 193], f16), T([16, 3, 3, 3], f16), [16], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [False, True, True]), {})
Operator: aten.copy_.default
cnt: 1, ((T([128, 3, 192, 192], f16), T([128, 3, 192, 192], f16)), {})
Operator: aten.div.Scalar
cnt: 1, ((T([128, 3072, 6, 6], f16, stride=(3072, 1, 0, 0)), 36), {})
cnt: 3, ((T([128, 1536, 6, 6], f16, stride=(1536, 1, 0, 0)), 36), {})
cnt: 6, ((T([128, 1536, 12, 12], f16, stride=(1536, 1, 0, 0)), 144), {})
cnt: 2, ((T([128, 512, 24, 24], f16, stride=(512, 1, 0, 0)), 576), {})
cnt: 1, ((T([128, 256, 48, 48], f16, stride=(256, 1, 0, 0)), 2304), {})
Operator: aten.gelu.default
cnt: 1, ((T([128, 16, 96, 96], f16),), {})
cnt: 1, ((T([128, 32, 96, 96], f16),), {})
cnt: 1, ((T([128, 64, 96, 96], f16),), {})
cnt: 4, ((T([128, 128, 48, 48], f16),), {})
cnt: 2, ((T([128, 256, 48, 48], f16),), {})
cnt: 5, ((T([128, 256, 24, 24], f16),), {})
cnt: 2, ((T([128, 512, 24, 24], f16),), {})
cnt: 1, ((T([128, 768, 24, 24], f16),), {})
cnt: 18, ((T([128, 768, 12, 12], f16),), {})
cnt: 6, ((T([128, 1536, 12, 12], f16),), {})
cnt: 8, ((T([128, 768, 6, 6], f16),), {})
cnt: 2, ((T([128, 1536, 6, 6], f16),), {})
cnt: 1, ((T([128, 3072, 6, 6], f16),), {})
Operator: aten.gelu_backward.default
cnt: 1, ((T([128, 3072, 6, 6], f16), T([128, 3072, 6, 6], f16)), {})
cnt: 8, ((T([128, 768, 6, 6], f16), T([128, 768, 6, 6], f16)), {})
cnt: 2, ((T([128, 1536, 6, 6], f16), T([128, 1536, 6, 6], f16)), {})
cnt: 18, ((T([128, 768, 12, 12], f16), T([128, 768, 12, 12], f16)), {})
cnt: 6, ((T([128, 1536, 12, 12], f16), T([128, 1536, 12, 12], f16)), {})
cnt: 1, ((T([128, 768, 24, 24], f16), T([128, 768, 24, 24], f16)), {})
cnt: 2, ((T([128, 512, 24, 24], f16), T([128, 512, 24, 24], f16)), {})
cnt: 5, ((T([128, 256, 24, 24], f16), T([128, 256, 24, 24], f16)), {})
cnt: 2, ((T([128, 256, 48, 48], f16), T([128, 256, 48, 48], f16)), {})
cnt: 4, ((T([128, 128, 48, 48], f16), T([128, 128, 48, 48], f16)), {})
cnt: 1, ((T([128, 64, 96, 96], f16), T([128, 64, 96, 96], f16)), {})
cnt: 1, ((T([128, 32, 96, 96], f16), T([128, 32, 96, 96], f16)), {})
cnt: 1, ((T([128, 16, 96, 96], f16), T([128, 16, 96, 96], f16)), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([128], i64),), {})
Operator: aten.mean.dim
cnt: 1, ((T([128, 256, 48, 48], f16), [2, 3], True), {})
cnt: 2, ((T([128, 512, 24, 24], f16), [2, 3], True), {})
cnt: 6, ((T([128, 1536, 12, 12], f16), [2, 3], True), {})
cnt: 3, ((T([128, 1536, 6, 6], f16), [2, 3], True), {})
cnt: 1, ((T([128, 3072, 6, 6], f16), [-1, -2], True), {})
Operator: aten.mm.default
cnt: 1, ((T([128, 1000], f16), T([1000, 3072], f16)), {})
cnt: 1, ((T([1000, 128], f16, stride=(1, 1000)), T([128, 3072], f16)), {})
Operator: aten.mul.Tensor
cnt: 2, ((T([16, 1, 1, 1], f16), 0.19245008972987526), {})
cnt: 2, ((T([32, 1, 1, 1], f16), 0.08333333333333333), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.05892556509887896), {})
cnt: 2, ((T([128, 1, 1, 1], f16), 0.041666666666666664), {})
cnt: 2, ((T([128, 128, 48, 48], f16), 1.0), {})
cnt: 4, ((T([256, 1, 1, 1], f16), 0.08838834764831845), {})
cnt: 2, ((T([128, 1, 1, 1], f16), 0.08838834764831845), {})
cnt: 4, ((T([128, 1, 1, 1], f16), 0.02946278254943948), {})
cnt: 2, ((T([128, 256, 48, 48], f16), T([128, 256, 1, 1], f16)), {})
cnt: 2, ((T([128, 256, 48, 48], f16), 2.0), {})
cnt: 2, ((T([128, 256, 48, 48], f16), 0.2), {})
cnt: 2, ((T([128, 256, 48, 48], f16), 0.9805806756909201), {})
cnt: 6, ((T([512, 1, 1, 1], f16), 0.0625), {})
cnt: 2, ((T([256, 1, 1, 1], f16), 0.0625), {})
cnt: 8, ((T([256, 1, 1, 1], f16), 0.02946278254943948), {})
cnt: 4, ((T([128, 512, 24, 24], f16), T([128, 512, 1, 1], f16)), {})
cnt: 4, ((T([128, 512, 24, 24], f16), 2.0), {})
cnt: 4, ((T([128, 512, 24, 24], f16), 0.2), {})
cnt: 2, ((T([128, 512, 24, 24], f16), 0.9805806756909201), {})
cnt: 2, ((T([256, 1, 1, 1], f16), 0.04419417382415922), {})
cnt: 2, ((T([128, 512, 24, 24], f16), 0.9622504486493761), {})
cnt: 2, ((T([1536, 1, 1, 1], f16), 0.04419417382415922), {})
cnt: 2, ((T([768, 1, 1, 1], f16), 0.04419417382415922), {})
cnt: 36, ((T([768, 1, 1, 1], f16), 0.02946278254943948), {})
cnt: 18, ((T([1536, 1, 1, 1], f16), 0.03608439182435161), {})
cnt: 12, ((T([128, 1536, 12, 12], f16), T([128, 1536, 1, 1], f16)), {})
cnt: 12, ((T([128, 1536, 12, 12], f16), 2.0), {})
cnt: 12, ((T([128, 1536, 12, 12], f16), 0.2), {})
cnt: 2, ((T([128, 1536, 12, 12], f16), 0.9805806756909201), {})
cnt: 16, ((T([768, 1, 1, 1], f16), 0.02551551815399144), {})
cnt: 2, ((T([128, 1536, 12, 12], f16), 0.9622504486493761), {})
cnt: 2, ((T([128, 1536, 12, 12], f16), 0.9449111825230679), {})
cnt: 2, ((T([128, 1536, 12, 12], f16), 0.9284766908852592), {})
cnt: 2, ((T([128, 1536, 12, 12], f16), 0.9128709291752768), {})
cnt: 2, ((T([128, 1536, 12, 12], f16), 0.8980265101338745), {})
cnt: 2, ((T([1536, 1, 1, 1], f16), 0.02551551815399144), {})
cnt: 6, ((T([128, 1536, 6, 6], f16), T([128, 1536, 1, 1], f16)), {})
cnt: 6, ((T([128, 1536, 6, 6], f16), 2.0), {})
cnt: 6, ((T([128, 1536, 6, 6], f16), 0.2), {})
cnt: 2, ((T([128, 1536, 6, 6], f16), 0.9805806756909201), {})
cnt: 2, ((T([128, 1536, 6, 6], f16), 0.9622504486493761), {})
cnt: 2, ((T([3072, 1, 1, 1], f16), 0.02551551815399144), {})
cnt: 1, ((T([128, 3072, 6, 6], f16), 1.7015043497085571), {})
cnt: 6, ((T([128, 1536, 6, 6], f16), T([128, 1536, 6, 6], f16)), {})
cnt: 3, ((T([128, 1536, 6, 6], f16), T([], f16)), {})
cnt: 8, ((T([128, 768, 6, 6], f16), 1.7015043497085571), {})
cnt: 2, ((T([128, 1536, 6, 6], f16), 1.7015043497085571), {})
cnt: 18, ((T([128, 768, 12, 12], f16), 1.7015043497085571), {})
cnt: 6, ((T([128, 1536, 12, 12], f16), 1.7015043497085571), {})
cnt: 12, ((T([128, 1536, 12, 12], f16), T([128, 1536, 12, 12], f16)), {})
cnt: 6, ((T([128, 1536, 12, 12], f16), T([], f16)), {})
cnt: 1, ((T([128, 768, 24, 24], f16), 1.7015043497085571), {})
cnt: 2, ((T([128, 512, 24, 24], f16), 1.7015043497085571), {})
cnt: 4, ((T([128, 512, 24, 24], f16), T([128, 512, 24, 24], f16)), {})
cnt: 2, ((T([128, 512, 24, 24], f16), T([], f16)), {})
cnt: 5, ((T([128, 256, 24, 24], f16), 1.7015043497085571), {})
cnt: 2, ((T([128, 256, 48, 48], f16), 1.7015043497085571), {})
cnt: 2, ((T([128, 256, 48, 48], f16), T([128, 256, 48, 48], f16)), {})
cnt: 1, ((T([128, 256, 48, 48], f16), T([], f16)), {})
cnt: 4, ((T([128, 128, 48, 48], f16), 1.7015043497085571), {})
cnt: 1, ((T([128, 64, 96, 96], f16), 1.7015043497085571), {})
cnt: 1, ((T([128, 32, 96, 96], f16), 1.7015043497085571), {})
cnt: 1, ((T([128, 16, 96, 96], f16), 1.7015043497085571), {})
Operator: aten.mul_.Tensor
cnt: 1, ((T([128, 16, 96, 96], f16), 1.7015043497085571), {})
cnt: 1, ((T([128, 32, 96, 96], f16), 1.7015043497085571), {})
cnt: 1, ((T([128, 64, 96, 96], f16), 1.7015043497085571), {})
cnt: 4, ((T([128, 128, 48, 48], f16), 1.7015043497085571), {})
cnt: 1, ((T([128, 256, 48, 48], f16), T([], f16)), {})
cnt: 2, ((T([128, 256, 48, 48], f16), 1.7015043497085571), {})
cnt: 5, ((T([128, 256, 24, 24], f16), 1.7015043497085571), {})
cnt: 2, ((T([128, 512, 24, 24], f16), T([], f16)), {})
cnt: 2, ((T([128, 512, 24, 24], f16), 1.7015043497085571), {})
cnt: 1, ((T([128, 768, 24, 24], f16), 1.7015043497085571), {})
cnt: 18, ((T([128, 768, 12, 12], f16), 1.7015043497085571), {})
cnt: 6, ((T([128, 1536, 12, 12], f16), T([], f16)), {})
cnt: 6, ((T([128, 1536, 12, 12], f16), 1.7015043497085571), {})
cnt: 8, ((T([128, 768, 6, 6], f16), 1.7015043497085571), {})
cnt: 3, ((T([128, 1536, 6, 6], f16), T([], f16)), {})
cnt: 2, ((T([128, 1536, 6, 6], f16), 1.7015043497085571), {})
cnt: 1, ((T([128, 3072, 6, 6], f16), 1.7015043497085571), {})
Operator: aten.native_batch_norm.default
cnt: 1, ((T([1, 16, 27], f16), T([16], f16), None, None, None, True, 0.0, 1e-05), {})
cnt: 1, ((T([1, 32, 144], f16), T([32], f16), None, None, None, True, 0.0, 1e-05), {})
cnt: 1, ((T([1, 64, 288], f16), T([64], f16), None, None, None, True, 0.0, 1e-05), {})
cnt: 1, ((T([1, 128, 576], f16), T([128], f16), None, None, None, True, 0.0, 1e-05), {})
cnt: 2, ((T([1, 256, 128], f16), T([256], f16), None, None, None, True, 0.0, 1e-05), {})
cnt: 1, ((T([1, 128, 128], f16), T([128], f16), None, None, None, True, 0.0, 1e-05), {})
cnt: 2, ((T([1, 128, 1152], f16), T([128], f16), None, None, None, True, 0.0, 1e-05), {})
cnt: 3, ((T([1, 512, 256], f16), T([512], f16), None, None, None, True, 0.0, 1e-05), {})
cnt: 1, ((T([1, 256, 256], f16), T([256], f16), None, None, None, True, 0.0, 1e-05), {})
cnt: 4, ((T([1, 256, 1152], f16), T([256], f16), None, None, None, True, 0.0, 1e-05), {})
cnt: 1, ((T([1, 256, 512], f16), T([256], f16), None, None, None, True, 0.0, 1e-05), {})
cnt: 1, ((T([1, 1536, 512], f16), T([1536], f16), None, None, None, True, 0.0, 1e-05), {})
cnt: 1, ((T([1, 768, 512], f16), T([768], f16), None, None, None, True, 0.0, 1e-05), {})
cnt: 18, ((T([1, 768, 1152], f16), T([768], f16), None, None, None, True, 0.0, 1e-05), {})
cnt: 9, ((T([1, 1536, 768], f16), T([1536], f16), None, None, None, True, 0.0, 1e-05), {})
cnt: 8, ((T([1, 768, 1536], f16), T([768], f16), None, None, None, True, 0.0, 1e-05), {})
cnt: 1, ((T([1, 1536, 1536], f16), T([1536], f16), None, None, None, True, 0.0, 1e-05), {})
cnt: 1, ((T([1, 3072, 1536], f16), T([3072], f16), None, None, None, True, 0.0, 1e-05), {})
Operator: aten.native_batch_norm_backward.default
cnt: 1, ((T([1, 3072, 1536], f16), T([1, 3072, 1536], f16), T([3072], f16), None, None, T([3072], f32), T([3072], f32), True, 1e-05, [True, True, False]), {})
cnt: 9, ((T([1, 1536, 768], f16), T([1, 1536, 768], f16), T([1536], f16), None, None, T([1536], f32), T([1536], f32), True, 1e-05, [True, True, False]), {})
cnt: 18, ((T([1, 768, 1152], f16), T([1, 768, 1152], f16), T([768], f16), None, None, T([768], f32), T([768], f32), True, 1e-05, [True, True, False]), {})
cnt: 8, ((T([1, 768, 1536], f16), T([1, 768, 1536], f16), T([768], f16), None, None, T([768], f32), T([768], f32), True, 1e-05, [True, True, False]), {})
cnt: 1, ((T([1, 1536, 1536], f16), T([1, 1536, 1536], f16), T([1536], f16), None, None, T([1536], f32), T([1536], f32), True, 1e-05, [True, True, False]), {})
cnt: 1, ((T([1, 768, 512], f16), T([1, 768, 512], f16), T([768], f16), None, None, T([768], f32), T([768], f32), True, 1e-05, [True, True, False]), {})
cnt: 1, ((T([1, 1536, 512], f16), T([1, 1536, 512], f16), T([1536], f16), None, None, T([1536], f32), T([1536], f32), True, 1e-05, [True, True, False]), {})
cnt: 3, ((T([1, 512, 256], f16), T([1, 512, 256], f16), T([512], f16), None, None, T([512], f32), T([512], f32), True, 1e-05, [True, True, False]), {})
cnt: 4, ((T([1, 256, 1152], f16), T([1, 256, 1152], f16), T([256], f16), None, None, T([256], f32), T([256], f32), True, 1e-05, [True, True, False]), {})
cnt: 1, ((T([1, 256, 512], f16), T([1, 256, 512], f16), T([256], f16), None, None, T([256], f32), T([256], f32), True, 1e-05, [True, True, False]), {})
cnt: 1, ((T([1, 256, 256], f16), T([1, 256, 256], f16), T([256], f16), None, None, T([256], f32), T([256], f32), True, 1e-05, [True, True, False]), {})
cnt: 2, ((T([1, 256, 128], f16), T([1, 256, 128], f16), T([256], f16), None, None, T([256], f32), T([256], f32), True, 1e-05, [True, True, False]), {})
cnt: 2, ((T([1, 128, 1152], f16), T([1, 128, 1152], f16), T([128], f16), None, None, T([128], f32), T([128], f32), True, 1e-05, [True, True, False]), {})
cnt: 1, ((T([1, 128, 128], f16), T([1, 128, 128], f16), T([128], f16), None, None, T([128], f32), T([128], f32), True, 1e-05, [True, True, False]), {})
cnt: 1, ((T([1, 128, 576], f16), T([1, 128, 576], f16), T([128], f16), None, None, T([128], f32), T([128], f32), True, 1e-05, [True, True, False]), {})
cnt: 1, ((T([1, 64, 288], f16), T([1, 64, 288], f16), T([64], f16), None, None, T([64], f32), T([64], f32), True, 1e-05, [True, True, False]), {})
cnt: 1, ((T([1, 32, 144], f16), T([1, 32, 144], f16), T([32], f16), None, None, T([32], f32), T([32], f32), True, 1e-05, [True, True, False]), {})
cnt: 1, ((T([1, 16, 27], f16), T([1, 16, 27], f16), T([16], f16), None, None, T([16], f32), T([16], f32), True, 1e-05, [True, True, False]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([128, 1000], f16), T([128], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([128, 1000], f16), T([128], i64), None, 1, -100), {})
Operator: aten.relu_.default
cnt: 1, ((T([128, 128, 1, 1], f16),), {})
cnt: 2, ((T([128, 256, 1, 1], f16),), {})
cnt: 9, ((T([128, 768, 1, 1], f16),), {})
Operator: aten.sigmoid.default
cnt: 1, ((T([128, 256, 1, 1], f16),), {})
cnt: 2, ((T([128, 512, 1, 1], f16),), {})
cnt: 9, ((T([128, 1536, 1, 1], f16),), {})
Operator: aten.sigmoid_backward.default
cnt: 9, ((T([128, 1536, 1, 1], f16), T([128, 1536, 1, 1], f16)), {})
cnt: 2, ((T([128, 512, 1, 1], f16), T([128, 512, 1, 1], f16)), {})
cnt: 1, ((T([128, 256, 1, 1], f16), T([128, 256, 1, 1], f16)), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([128, 1000], f16), [0], True), {})
cnt: 3, ((T([128, 1536, 6, 6], f16), [2, 3], True), {})
cnt: 6, ((T([128, 1536, 12, 12], f16), [2, 3], True), {})
cnt: 2, ((T([128, 512, 24, 24], f16), [2, 3], True), {})
cnt: 1, ((T([128, 256, 48, 48], f16), [2, 3], True), {})
Operator: aten.sum.default
cnt: 3, ((T([128, 1536, 6, 6], f16),), {})
cnt: 6, ((T([128, 1536, 12, 12], f16),), {})
cnt: 2, ((T([128, 512, 24, 24], f16),), {})
cnt: 1, ((T([128, 256, 48, 48], f16),), {})
Operator: aten.threshold_backward.default
cnt: 9, ((T([128, 768, 1, 1], f16), T([128, 768, 1, 1], f16), 0), {})
cnt: 2, ((T([128, 256, 1, 1], f16), T([128, 256, 1, 1], f16), 0), {})
cnt: 1, ((T([128, 128, 1, 1], f16), T([128, 128, 1, 1], f16), 0), {})

View File

@ -0,0 +1,545 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([32, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([32, 1000], f16), T([32, 1000], f16), 1, f16), {})
Operator: aten.add.Tensor
cnt: 111, ((T([], i64), 1), {})
cnt: 1, ((T([32, 256, 56, 56], f16, stride=(928256, 3136, 56, 1)), T([32, 256, 56, 56], f16, stride=(865536, 3136, 56, 1))), {})
cnt: 3, ((T([32, 256, 56, 56], f16), T([32, 256, 56, 56], f16, stride=(865536, 3136, 56, 1))), {})
cnt: 1, ((T([32, 512, 28, 28], f16, stride=(501760, 784, 28, 1)), T([32, 512, 28, 28], f16, stride=(451584, 784, 28, 1))), {})
cnt: 7, ((T([32, 512, 28, 28], f16), T([32, 512, 28, 28], f16, stride=(451584, 784, 28, 1))), {})
cnt: 1, ((T([32, 1024, 14, 14], f16, stride=(225792, 196, 14, 1)), T([32, 1024, 14, 14], f16, stride=(213248, 196, 14, 1))), {})
cnt: 19, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16, stride=(213248, 196, 14, 1))), {})
cnt: 1, ((T([32, 2048, 7, 7], f16, stride=(112896, 49, 7, 1)), T([32, 2048, 7, 7], f16, stride=(106624, 49, 7, 1))), {})
cnt: 2, ((T([32, 2048, 7, 7], f16), T([32, 2048, 7, 7], f16, stride=(106624, 49, 7, 1))), {})
cnt: 3, ((T([32, 2176, 7, 7], f16), T([32, 2176, 7, 7], f16)), {})
cnt: 1, ((T([32, 2048, 7, 7], f16, stride=(131712, 49, 7, 1)), T([32, 2048, 7, 7], f16, stride=(125440, 49, 7, 1))), {})
cnt: 1, ((T([32, 512, 7, 7], f16, stride=(131712, 49, 7, 1)), T([32, 512, 7, 7], f16, stride=(125440, 49, 7, 1))), {})
cnt: 1, ((T([32, 2048, 7, 7], f16), T([32, 2048, 7, 7], f16, stride=(119168, 49, 7, 1))), {})
cnt: 1, ((T([32, 384, 7, 7], f16, stride=(25088, 49, 7, 1)), T([32, 384, 7, 7], f16, stride=(119168, 49, 7, 1))), {})
cnt: 1, ((T([32, 2304, 7, 7], f16), T([32, 2304, 7, 7], f16)), {})
cnt: 1, ((T([32, 2432, 14, 14], f16), T([32, 2432, 14, 14], f16)), {})
cnt: 20, ((T([32, 1088, 14, 14], f16), T([32, 1088, 14, 14], f16)), {})
cnt: 1, ((T([32, 1024, 14, 14], f16, stride=(476672, 196, 14, 1)), T([32, 1024, 14, 14], f16, stride=(464128, 196, 14, 1))), {})
cnt: 1, ((T([32, 1344, 14, 14], f16, stride=(476672, 196, 14, 1)), T([32, 1344, 14, 14], f16, stride=(464128, 196, 14, 1))), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16, stride=(451584, 196, 14, 1))), {})
cnt: 1, ((T([32, 1280, 14, 14], f16, stride=(263424, 196, 14, 1)), T([32, 1280, 14, 14], f16, stride=(451584, 196, 14, 1))), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16, stride=(439040, 196, 14, 1))), {})
cnt: 1, ((T([32, 1216, 14, 14], f16, stride=(250880, 196, 14, 1)), T([32, 1216, 14, 14], f16, stride=(439040, 196, 14, 1))), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16, stride=(426496, 196, 14, 1))), {})
cnt: 1, ((T([32, 1152, 14, 14], f16, stride=(238336, 196, 14, 1)), T([32, 1152, 14, 14], f16, stride=(426496, 196, 14, 1))), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16, stride=(413952, 196, 14, 1))), {})
cnt: 1, ((T([32, 1088, 14, 14], f16, stride=(225792, 196, 14, 1)), T([32, 1088, 14, 14], f16, stride=(413952, 196, 14, 1))), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16, stride=(401408, 196, 14, 1))), {})
cnt: 1, ((T([32, 1024, 14, 14], f16, stride=(213248, 196, 14, 1)), T([32, 1024, 14, 14], f16, stride=(401408, 196, 14, 1))), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16, stride=(388864, 196, 14, 1))), {})
cnt: 1, ((T([32, 960, 14, 14], f16, stride=(200704, 196, 14, 1)), T([32, 960, 14, 14], f16, stride=(388864, 196, 14, 1))), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16, stride=(376320, 196, 14, 1))), {})
cnt: 1, ((T([32, 896, 14, 14], f16, stride=(188160, 196, 14, 1)), T([32, 896, 14, 14], f16, stride=(376320, 196, 14, 1))), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16, stride=(363776, 196, 14, 1))), {})
cnt: 1, ((T([32, 832, 14, 14], f16, stride=(175616, 196, 14, 1)), T([32, 832, 14, 14], f16, stride=(363776, 196, 14, 1))), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16, stride=(351232, 196, 14, 1))), {})
cnt: 1, ((T([32, 768, 14, 14], f16, stride=(163072, 196, 14, 1)), T([32, 768, 14, 14], f16, stride=(351232, 196, 14, 1))), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16, stride=(338688, 196, 14, 1))), {})
cnt: 1, ((T([32, 704, 14, 14], f16, stride=(150528, 196, 14, 1)), T([32, 704, 14, 14], f16, stride=(338688, 196, 14, 1))), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16, stride=(326144, 196, 14, 1))), {})
cnt: 1, ((T([32, 640, 14, 14], f16, stride=(137984, 196, 14, 1)), T([32, 640, 14, 14], f16, stride=(326144, 196, 14, 1))), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16, stride=(313600, 196, 14, 1))), {})
cnt: 1, ((T([32, 576, 14, 14], f16, stride=(125440, 196, 14, 1)), T([32, 576, 14, 14], f16, stride=(313600, 196, 14, 1))), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16, stride=(301056, 196, 14, 1))), {})
cnt: 1, ((T([32, 512, 14, 14], f16, stride=(112896, 196, 14, 1)), T([32, 512, 14, 14], f16, stride=(301056, 196, 14, 1))), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16, stride=(288512, 196, 14, 1))), {})
cnt: 1, ((T([32, 448, 14, 14], f16, stride=(100352, 196, 14, 1)), T([32, 448, 14, 14], f16, stride=(288512, 196, 14, 1))), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16, stride=(275968, 196, 14, 1))), {})
cnt: 1, ((T([32, 384, 14, 14], f16, stride=(87808, 196, 14, 1)), T([32, 384, 14, 14], f16, stride=(275968, 196, 14, 1))), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16, stride=(263424, 196, 14, 1))), {})
cnt: 1, ((T([32, 320, 14, 14], f16, stride=(75264, 196, 14, 1)), T([32, 320, 14, 14], f16, stride=(263424, 196, 14, 1))), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16, stride=(250880, 196, 14, 1))), {})
cnt: 1, ((T([32, 256, 14, 14], f16, stride=(62720, 196, 14, 1)), T([32, 256, 14, 14], f16, stride=(250880, 196, 14, 1))), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16, stride=(238336, 196, 14, 1))), {})
cnt: 1, ((T([32, 192, 14, 14], f16, stride=(50176, 196, 14, 1)), T([32, 192, 14, 14], f16, stride=(238336, 196, 14, 1))), {})
cnt: 1, ((T([32, 1152, 14, 14], f16), T([32, 1152, 14, 14], f16)), {})
cnt: 1, ((T([32, 1152, 28, 28], f16), T([32, 1152, 28, 28], f16)), {})
cnt: 8, ((T([32, 576, 28, 28], f16), T([32, 576, 28, 28], f16)), {})
cnt: 1, ((T([32, 512, 28, 28], f16, stride=(903168, 784, 28, 1)), T([32, 512, 28, 28], f16, stride=(852992, 784, 28, 1))), {})
cnt: 1, ((T([32, 576, 28, 28], f16, stride=(903168, 784, 28, 1)), T([32, 576, 28, 28], f16, stride=(852992, 784, 28, 1))), {})
cnt: 1, ((T([32, 512, 28, 28], f16), T([32, 512, 28, 28], f16, stride=(802816, 784, 28, 1))), {})
cnt: 1, ((T([32, 512, 28, 28], f16, stride=(451584, 784, 28, 1)), T([32, 512, 28, 28], f16, stride=(802816, 784, 28, 1))), {})
cnt: 1, ((T([32, 512, 28, 28], f16), T([32, 512, 28, 28], f16, stride=(752640, 784, 28, 1))), {})
cnt: 1, ((T([32, 448, 28, 28], f16, stride=(401408, 784, 28, 1)), T([32, 448, 28, 28], f16, stride=(752640, 784, 28, 1))), {})
cnt: 1, ((T([32, 512, 28, 28], f16), T([32, 512, 28, 28], f16, stride=(702464, 784, 28, 1))), {})
cnt: 1, ((T([32, 384, 28, 28], f16, stride=(351232, 784, 28, 1)), T([32, 384, 28, 28], f16, stride=(702464, 784, 28, 1))), {})
cnt: 1, ((T([32, 512, 28, 28], f16), T([32, 512, 28, 28], f16, stride=(652288, 784, 28, 1))), {})
cnt: 1, ((T([32, 320, 28, 28], f16, stride=(301056, 784, 28, 1)), T([32, 320, 28, 28], f16, stride=(652288, 784, 28, 1))), {})
cnt: 1, ((T([32, 512, 28, 28], f16), T([32, 512, 28, 28], f16, stride=(602112, 784, 28, 1))), {})
cnt: 1, ((T([32, 256, 28, 28], f16, stride=(250880, 784, 28, 1)), T([32, 256, 28, 28], f16, stride=(602112, 784, 28, 1))), {})
cnt: 1, ((T([32, 512, 28, 28], f16), T([32, 512, 28, 28], f16, stride=(551936, 784, 28, 1))), {})
cnt: 1, ((T([32, 192, 28, 28], f16, stride=(200704, 784, 28, 1)), T([32, 192, 28, 28], f16, stride=(551936, 784, 28, 1))), {})
cnt: 1, ((T([32, 640, 28, 28], f16), T([32, 640, 28, 28], f16)), {})
cnt: 1, ((T([32, 376, 56, 56], f16), T([32, 376, 56, 56], f16)), {})
cnt: 4, ((T([32, 276, 56, 56], f16), T([32, 276, 56, 56], f16)), {})
cnt: 1, ((T([32, 256, 56, 56], f16, stride=(1179136, 3136, 56, 1)), T([32, 256, 56, 56], f16, stride=(1116416, 3136, 56, 1))), {})
cnt: 1, ((T([32, 100, 56, 56], f16, stride=(1179136, 3136, 56, 1)), T([32, 100, 56, 56], f16, stride=(1116416, 3136, 56, 1))), {})
cnt: 1, ((T([32, 256, 56, 56], f16), T([32, 256, 56, 56], f16, stride=(1053696, 3136, 56, 1))), {})
cnt: 1, ((T([32, 80, 56, 56], f16, stride=(313600, 3136, 56, 1)), T([32, 80, 56, 56], f16, stride=(1053696, 3136, 56, 1))), {})
cnt: 1, ((T([32, 256, 56, 56], f16), T([32, 256, 56, 56], f16, stride=(990976, 3136, 56, 1))), {})
cnt: 1, ((T([32, 60, 56, 56], f16, stride=(250880, 3136, 56, 1)), T([32, 60, 56, 56], f16, stride=(990976, 3136, 56, 1))), {})
cnt: 1, ((T([32, 296, 56, 56], f16), T([32, 296, 56, 56], f16)), {})
cnt: 1, ((T([32, 128, 56, 56], f16), T([32, 128, 56, 56], f16)), {})
Operator: aten.cat.default
cnt: 1, (([T([32, 40, 56, 56], f16, stride=(928256, 3136, 56, 1)), T([32, 20, 56, 56], f16, stride=(865536, 3136, 56, 1))], 1), {})
cnt: 1, (([T([32, 256, 56, 56], f16), T([32, 60, 56, 56], f16)], 1), {})
cnt: 1, (([T([32, 60, 56, 56], f16), T([32, 20, 56, 56], f16, stride=(865536, 3136, 56, 1))], 1), {})
cnt: 1, (([T([32, 256, 56, 56], f16), T([32, 80, 56, 56], f16)], 1), {})
cnt: 1, (([T([32, 80, 56, 56], f16), T([32, 20, 56, 56], f16, stride=(865536, 3136, 56, 1))], 1), {})
cnt: 1, (([T([32, 256, 56, 56], f16), T([32, 100, 56, 56], f16)], 1), {})
cnt: 1, (([T([32, 100, 56, 56], f16), T([32, 20, 56, 56], f16, stride=(865536, 3136, 56, 1))], 1), {})
cnt: 1, (([T([32, 256, 56, 56], f16), T([32, 120, 56, 56], f16)], 1), {})
cnt: 1, (([T([32, 128, 28, 28], f16, stride=(501760, 784, 28, 1)), T([32, 64, 28, 28], f16, stride=(451584, 784, 28, 1))], 1), {})
cnt: 1, (([T([32, 512, 28, 28], f16), T([32, 192, 28, 28], f16)], 1), {})
cnt: 1, (([T([32, 192, 28, 28], f16), T([32, 64, 28, 28], f16, stride=(451584, 784, 28, 1))], 1), {})
cnt: 1, (([T([32, 512, 28, 28], f16), T([32, 256, 28, 28], f16)], 1), {})
cnt: 1, (([T([32, 256, 28, 28], f16), T([32, 64, 28, 28], f16, stride=(451584, 784, 28, 1))], 1), {})
cnt: 1, (([T([32, 512, 28, 28], f16), T([32, 320, 28, 28], f16)], 1), {})
cnt: 1, (([T([32, 320, 28, 28], f16), T([32, 64, 28, 28], f16, stride=(451584, 784, 28, 1))], 1), {})
cnt: 1, (([T([32, 512, 28, 28], f16), T([32, 384, 28, 28], f16)], 1), {})
cnt: 1, (([T([32, 384, 28, 28], f16), T([32, 64, 28, 28], f16, stride=(451584, 784, 28, 1))], 1), {})
cnt: 1, (([T([32, 512, 28, 28], f16), T([32, 448, 28, 28], f16)], 1), {})
cnt: 1, (([T([32, 448, 28, 28], f16), T([32, 64, 28, 28], f16, stride=(451584, 784, 28, 1))], 1), {})
cnt: 1, (([T([32, 512, 28, 28], f16), T([32, 512, 28, 28], f16)], 1), {})
cnt: 1, (([T([32, 512, 28, 28], f16), T([32, 64, 28, 28], f16, stride=(451584, 784, 28, 1))], 1), {})
cnt: 1, (([T([32, 512, 28, 28], f16), T([32, 576, 28, 28], f16)], 1), {})
cnt: 1, (([T([32, 576, 28, 28], f16), T([32, 64, 28, 28], f16, stride=(451584, 784, 28, 1))], 1), {})
cnt: 1, (([T([32, 512, 28, 28], f16), T([32, 640, 28, 28], f16)], 1), {})
cnt: 1, (([T([32, 128, 14, 14], f16, stride=(225792, 196, 14, 1)), T([32, 64, 14, 14], f16, stride=(213248, 196, 14, 1))], 1), {})
cnt: 1, (([T([32, 1024, 14, 14], f16), T([32, 192, 14, 14], f16)], 1), {})
cnt: 1, (([T([32, 192, 14, 14], f16), T([32, 64, 14, 14], f16, stride=(213248, 196, 14, 1))], 1), {})
cnt: 1, (([T([32, 1024, 14, 14], f16), T([32, 256, 14, 14], f16)], 1), {})
cnt: 1, (([T([32, 256, 14, 14], f16), T([32, 64, 14, 14], f16, stride=(213248, 196, 14, 1))], 1), {})
cnt: 1, (([T([32, 1024, 14, 14], f16), T([32, 320, 14, 14], f16)], 1), {})
cnt: 1, (([T([32, 320, 14, 14], f16), T([32, 64, 14, 14], f16, stride=(213248, 196, 14, 1))], 1), {})
cnt: 1, (([T([32, 1024, 14, 14], f16), T([32, 384, 14, 14], f16)], 1), {})
cnt: 1, (([T([32, 384, 14, 14], f16), T([32, 64, 14, 14], f16, stride=(213248, 196, 14, 1))], 1), {})
cnt: 1, (([T([32, 1024, 14, 14], f16), T([32, 448, 14, 14], f16)], 1), {})
cnt: 1, (([T([32, 448, 14, 14], f16), T([32, 64, 14, 14], f16, stride=(213248, 196, 14, 1))], 1), {})
cnt: 1, (([T([32, 1024, 14, 14], f16), T([32, 512, 14, 14], f16)], 1), {})
cnt: 1, (([T([32, 512, 14, 14], f16), T([32, 64, 14, 14], f16, stride=(213248, 196, 14, 1))], 1), {})
cnt: 1, (([T([32, 1024, 14, 14], f16), T([32, 576, 14, 14], f16)], 1), {})
cnt: 1, (([T([32, 576, 14, 14], f16), T([32, 64, 14, 14], f16, stride=(213248, 196, 14, 1))], 1), {})
cnt: 1, (([T([32, 1024, 14, 14], f16), T([32, 640, 14, 14], f16)], 1), {})
cnt: 1, (([T([32, 640, 14, 14], f16), T([32, 64, 14, 14], f16, stride=(213248, 196, 14, 1))], 1), {})
cnt: 1, (([T([32, 1024, 14, 14], f16), T([32, 704, 14, 14], f16)], 1), {})
cnt: 1, (([T([32, 704, 14, 14], f16), T([32, 64, 14, 14], f16, stride=(213248, 196, 14, 1))], 1), {})
cnt: 1, (([T([32, 1024, 14, 14], f16), T([32, 768, 14, 14], f16)], 1), {})
cnt: 1, (([T([32, 768, 14, 14], f16), T([32, 64, 14, 14], f16, stride=(213248, 196, 14, 1))], 1), {})
cnt: 1, (([T([32, 1024, 14, 14], f16), T([32, 832, 14, 14], f16)], 1), {})
cnt: 1, (([T([32, 832, 14, 14], f16), T([32, 64, 14, 14], f16, stride=(213248, 196, 14, 1))], 1), {})
cnt: 1, (([T([32, 1024, 14, 14], f16), T([32, 896, 14, 14], f16)], 1), {})
cnt: 1, (([T([32, 896, 14, 14], f16), T([32, 64, 14, 14], f16, stride=(213248, 196, 14, 1))], 1), {})
cnt: 1, (([T([32, 1024, 14, 14], f16), T([32, 960, 14, 14], f16)], 1), {})
cnt: 1, (([T([32, 960, 14, 14], f16), T([32, 64, 14, 14], f16, stride=(213248, 196, 14, 1))], 1), {})
cnt: 1, (([T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16)], 1), {})
cnt: 1, (([T([32, 1024, 14, 14], f16), T([32, 64, 14, 14], f16, stride=(213248, 196, 14, 1))], 1), {})
cnt: 1, (([T([32, 1024, 14, 14], f16), T([32, 1088, 14, 14], f16)], 1), {})
cnt: 1, (([T([32, 1088, 14, 14], f16), T([32, 64, 14, 14], f16, stride=(213248, 196, 14, 1))], 1), {})
cnt: 1, (([T([32, 1024, 14, 14], f16), T([32, 1152, 14, 14], f16)], 1), {})
cnt: 1, (([T([32, 1152, 14, 14], f16), T([32, 64, 14, 14], f16, stride=(213248, 196, 14, 1))], 1), {})
cnt: 1, (([T([32, 1024, 14, 14], f16), T([32, 1216, 14, 14], f16)], 1), {})
cnt: 1, (([T([32, 1216, 14, 14], f16), T([32, 64, 14, 14], f16, stride=(213248, 196, 14, 1))], 1), {})
cnt: 1, (([T([32, 1024, 14, 14], f16), T([32, 1280, 14, 14], f16)], 1), {})
cnt: 1, (([T([32, 1280, 14, 14], f16), T([32, 64, 14, 14], f16, stride=(213248, 196, 14, 1))], 1), {})
cnt: 1, (([T([32, 1024, 14, 14], f16), T([32, 1344, 14, 14], f16)], 1), {})
cnt: 1, (([T([32, 1344, 14, 14], f16), T([32, 64, 14, 14], f16, stride=(213248, 196, 14, 1))], 1), {})
cnt: 1, (([T([32, 1024, 14, 14], f16), T([32, 1408, 14, 14], f16)], 1), {})
cnt: 1, (([T([32, 256, 7, 7], f16, stride=(112896, 49, 7, 1)), T([32, 128, 7, 7], f16, stride=(106624, 49, 7, 1))], 1), {})
cnt: 1, (([T([32, 2048, 7, 7], f16), T([32, 384, 7, 7], f16)], 1), {})
cnt: 1, (([T([32, 384, 7, 7], f16), T([32, 128, 7, 7], f16, stride=(106624, 49, 7, 1))], 1), {})
cnt: 1, (([T([32, 2048, 7, 7], f16), T([32, 512, 7, 7], f16)], 1), {})
cnt: 1, (([T([32, 512, 7, 7], f16), T([32, 128, 7, 7], f16, stride=(106624, 49, 7, 1))], 1), {})
cnt: 1, (([T([32, 2048, 7, 7], f16), T([32, 640, 7, 7], f16)], 1), {})
Operator: aten.clone.default
cnt: 1, ((T([32, 3, 224, 224], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([32, 3, 224, 224], f16), T([128, 3, 7, 7], f16), None, [2, 2], [3, 3], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 128, 56, 56], f16), T([296, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 128, 56, 56], f16), T([200, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([32, 200, 56, 56], f16), T([200, 4, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 50), {})
cnt: 4, ((T([32, 200, 56, 56], f16), T([276, 200, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 316, 56, 56], f16), T([200, 316, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 336, 56, 56], f16), T([200, 336, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 356, 56, 56], f16), T([200, 356, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 376, 56, 56], f16), T([640, 376, 1, 1], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 376, 56, 56], f16), T([400, 376, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 400, 56, 56], f16), T([400, 8, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 50), {})
cnt: 8, ((T([32, 400, 28, 28], f16), T([576, 400, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 704, 28, 28], f16), T([400, 704, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 7, ((T([32, 400, 28, 28], f16), T([400, 8, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 50), {})
cnt: 1, ((T([32, 768, 28, 28], f16), T([400, 768, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 832, 28, 28], f16), T([400, 832, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 896, 28, 28], f16), T([400, 896, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 960, 28, 28], f16), T([400, 960, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 1024, 28, 28], f16), T([400, 1024, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 1088, 28, 28], f16), T([400, 1088, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 1152, 28, 28], f16), T([1152, 1152, 1, 1], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 1152, 28, 28], f16), T([800, 1152, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 800, 28, 28], f16), T([800, 16, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 50), {})
cnt: 20, ((T([32, 800, 14, 14], f16), T([1088, 800, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 1216, 14, 14], f16), T([800, 1216, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 19, ((T([32, 800, 14, 14], f16), T([800, 16, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 50), {})
cnt: 1, ((T([32, 1280, 14, 14], f16), T([800, 1280, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 1344, 14, 14], f16), T([800, 1344, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 1408, 14, 14], f16), T([800, 1408, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 1472, 14, 14], f16), T([800, 1472, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 1536, 14, 14], f16), T([800, 1536, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 1600, 14, 14], f16), T([800, 1600, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 1664, 14, 14], f16), T([800, 1664, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 1728, 14, 14], f16), T([800, 1728, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 1792, 14, 14], f16), T([800, 1792, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 1856, 14, 14], f16), T([800, 1856, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 1920, 14, 14], f16), T([800, 1920, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 1984, 14, 14], f16), T([800, 1984, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 2048, 14, 14], f16), T([800, 2048, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 2112, 14, 14], f16), T([800, 2112, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 2176, 14, 14], f16), T([800, 2176, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 2240, 14, 14], f16), T([800, 2240, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 2304, 14, 14], f16), T([800, 2304, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 2368, 14, 14], f16), T([800, 2368, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 2432, 14, 14], f16), T([2304, 2432, 1, 1], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 2432, 14, 14], f16), T([1600, 2432, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 1600, 14, 14], f16), T([1600, 32, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 50), {})
cnt: 3, ((T([32, 1600, 7, 7], f16), T([2176, 1600, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 2432, 7, 7], f16), T([1600, 2432, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([32, 1600, 7, 7], f16), T([1600, 32, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 50), {})
cnt: 1, ((T([32, 2560, 7, 7], f16), T([1600, 2560, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 2688, 1, 1], f16), T([1000, 2688, 1, 1], f16), T([1000], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([32, 1000, 1, 1], f16), T([32, 2688, 1, 1], f16), T([1000, 2688, 1, 1], f16), [1000], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 3, ((T([32, 2176, 7, 7], f16), T([32, 1600, 7, 7], f16), T([2176, 1600, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([32, 1600, 7, 7], f16), T([32, 1600, 7, 7], f16), T([1600, 32, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 50, [True, True, False]), {})
cnt: 1, ((T([32, 1600, 7, 7], f16), T([32, 2560, 7, 7], f16), T([1600, 2560, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 1600, 7, 7], f16), T([32, 2432, 7, 7], f16), T([1600, 2432, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 1600, 7, 7], f16), T([32, 1600, 14, 14], f16), T([1600, 32, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 50, [True, True, False]), {})
cnt: 1, ((T([32, 1600, 14, 14], f16), T([32, 2432, 14, 14], f16), T([1600, 2432, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 2304, 7, 7], f16), T([32, 2432, 14, 14], f16), T([2304, 2432, 1, 1], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 20, ((T([32, 1088, 14, 14], f16), T([32, 800, 14, 14], f16), T([1088, 800, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 19, ((T([32, 800, 14, 14], f16), T([32, 800, 14, 14], f16), T([800, 16, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 50, [True, True, False]), {})
cnt: 1, ((T([32, 800, 14, 14], f16), T([32, 2368, 14, 14], f16), T([800, 2368, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 800, 14, 14], f16), T([32, 2304, 14, 14], f16), T([800, 2304, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 800, 14, 14], f16), T([32, 2240, 14, 14], f16), T([800, 2240, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 800, 14, 14], f16), T([32, 2176, 14, 14], f16), T([800, 2176, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 800, 14, 14], f16), T([32, 2112, 14, 14], f16), T([800, 2112, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 800, 14, 14], f16), T([32, 2048, 14, 14], f16), T([800, 2048, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 800, 14, 14], f16), T([32, 1984, 14, 14], f16), T([800, 1984, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 800, 14, 14], f16), T([32, 1920, 14, 14], f16), T([800, 1920, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 800, 14, 14], f16), T([32, 1856, 14, 14], f16), T([800, 1856, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 800, 14, 14], f16), T([32, 1792, 14, 14], f16), T([800, 1792, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 800, 14, 14], f16), T([32, 1728, 14, 14], f16), T([800, 1728, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 800, 14, 14], f16), T([32, 1664, 14, 14], f16), T([800, 1664, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 800, 14, 14], f16), T([32, 1600, 14, 14], f16), T([800, 1600, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 800, 14, 14], f16), T([32, 1536, 14, 14], f16), T([800, 1536, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 800, 14, 14], f16), T([32, 1472, 14, 14], f16), T([800, 1472, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 800, 14, 14], f16), T([32, 1408, 14, 14], f16), T([800, 1408, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 800, 14, 14], f16), T([32, 1344, 14, 14], f16), T([800, 1344, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 800, 14, 14], f16), T([32, 1280, 14, 14], f16), T([800, 1280, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 800, 14, 14], f16), T([32, 1216, 14, 14], f16), T([800, 1216, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 800, 14, 14], f16), T([32, 800, 28, 28], f16), T([800, 16, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 50, [True, True, False]), {})
cnt: 1, ((T([32, 800, 28, 28], f16), T([32, 1152, 28, 28], f16), T([800, 1152, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 1152, 14, 14], f16), T([32, 1152, 28, 28], f16), T([1152, 1152, 1, 1], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 8, ((T([32, 576, 28, 28], f16), T([32, 400, 28, 28], f16), T([576, 400, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 7, ((T([32, 400, 28, 28], f16), T([32, 400, 28, 28], f16), T([400, 8, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 50, [True, True, False]), {})
cnt: 1, ((T([32, 400, 28, 28], f16), T([32, 1088, 28, 28], f16), T([400, 1088, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 400, 28, 28], f16), T([32, 1024, 28, 28], f16), T([400, 1024, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 400, 28, 28], f16), T([32, 960, 28, 28], f16), T([400, 960, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 400, 28, 28], f16), T([32, 896, 28, 28], f16), T([400, 896, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 400, 28, 28], f16), T([32, 832, 28, 28], f16), T([400, 832, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 400, 28, 28], f16), T([32, 768, 28, 28], f16), T([400, 768, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 400, 28, 28], f16), T([32, 704, 28, 28], f16), T([400, 704, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 400, 28, 28], f16), T([32, 400, 56, 56], f16), T([400, 8, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 50, [True, True, False]), {})
cnt: 1, ((T([32, 400, 56, 56], f16), T([32, 376, 56, 56], f16), T([400, 376, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 640, 28, 28], f16), T([32, 376, 56, 56], f16), T([640, 376, 1, 1], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([32, 276, 56, 56], f16), T([32, 200, 56, 56], f16), T([276, 200, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([32, 200, 56, 56], f16), T([32, 200, 56, 56], f16), T([200, 4, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 50, [True, True, False]), {})
cnt: 1, ((T([32, 200, 56, 56], f16), T([32, 356, 56, 56], f16), T([200, 356, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 200, 56, 56], f16), T([32, 336, 56, 56], f16), T([200, 336, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 200, 56, 56], f16), T([32, 316, 56, 56], f16), T([200, 316, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 200, 56, 56], f16), T([32, 128, 56, 56], f16), T([200, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 296, 56, 56], f16), T([32, 128, 56, 56], f16), T([296, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 128, 112, 112], f16), T([32, 3, 224, 224], f16), T([128, 3, 7, 7], f16), [0], [2, 2], [3, 3], [1, 1], False, [0, 0], 1, [False, True, False]), {})
Operator: aten.copy_.default
cnt: 1, ((T([32, 3, 224, 224], f16), T([32, 3, 224, 224], f16)), {})
Operator: aten.div.Scalar
cnt: 1, ((T([32, 2688, 7, 7], f16, stride=(2688, 1, 0, 0)), 49), {})
Operator: aten.elu.default
cnt: 1, ((T([32, 2688, 7, 7], f16), 1.0), {})
Operator: aten.elu_backward.default
cnt: 1, ((T([32, 2688, 7, 7], f16), 1.0, 1, 1, False, T([32, 2688, 7, 7], f16)), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([32], i64),), {})
Operator: aten.max_pool2d_with_indices.default
cnt: 1, ((T([32, 128, 112, 112], f16), [3, 3], [2, 2], [1, 1]), {})
Operator: aten.max_pool2d_with_indices_backward.default
cnt: 1, ((T([32, 128, 56, 56], f16), T([32, 128, 112, 112], f16), [3, 3], [2, 2], [1, 1], [1, 1], False, T([32, 128, 56, 56], i64)), {})
Operator: aten.mean.dim
cnt: 1, ((T([32, 2688, 7, 7], f16), [-1, -2], True), {})
Operator: aten.native_batch_norm.default
cnt: 1, ((T([32, 128, 112, 112], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 0.001), {})
cnt: 2, ((T([32, 128, 56, 56], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 0.001), {})
cnt: 8, ((T([32, 200, 56, 56], f16), T([200], f16), T([200], f16), T([200], f16), T([200], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 316, 56, 56], f16), T([316], f16), T([316], f16), T([316], f16), T([316], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 336, 56, 56], f16), T([336], f16), T([336], f16), T([336], f16), T([336], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 356, 56, 56], f16), T([356], f16), T([356], f16), T([356], f16), T([356], f16), True, 0.1, 0.001), {})
cnt: 2, ((T([32, 376, 56, 56], f16), T([376], f16), T([376], f16), T([376], f16), T([376], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 400, 56, 56], f16), T([400], f16), T([400], f16), T([400], f16), T([400], f16), True, 0.1, 0.001), {})
cnt: 15, ((T([32, 400, 28, 28], f16), T([400], f16), T([400], f16), T([400], f16), T([400], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 704, 28, 28], f16), T([704], f16), T([704], f16), T([704], f16), T([704], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 768, 28, 28], f16), T([768], f16), T([768], f16), T([768], f16), T([768], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 832, 28, 28], f16), T([832], f16), T([832], f16), T([832], f16), T([832], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 896, 28, 28], f16), T([896], f16), T([896], f16), T([896], f16), T([896], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 960, 28, 28], f16), T([960], f16), T([960], f16), T([960], f16), T([960], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 1024, 28, 28], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 1088, 28, 28], f16), T([1088], f16), T([1088], f16), T([1088], f16), T([1088], f16), True, 0.1, 0.001), {})
cnt: 2, ((T([32, 1152, 28, 28], f16), T([1152], f16), T([1152], f16), T([1152], f16), T([1152], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 800, 28, 28], f16), T([800], f16), T([800], f16), T([800], f16), T([800], f16), True, 0.1, 0.001), {})
cnt: 39, ((T([32, 800, 14, 14], f16), T([800], f16), T([800], f16), T([800], f16), T([800], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 1216, 14, 14], f16), T([1216], f16), T([1216], f16), T([1216], f16), T([1216], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 1280, 14, 14], f16), T([1280], f16), T([1280], f16), T([1280], f16), T([1280], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 1344, 14, 14], f16), T([1344], f16), T([1344], f16), T([1344], f16), T([1344], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 1408, 14, 14], f16), T([1408], f16), T([1408], f16), T([1408], f16), T([1408], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 1472, 14, 14], f16), T([1472], f16), T([1472], f16), T([1472], f16), T([1472], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 1536, 14, 14], f16), T([1536], f16), T([1536], f16), T([1536], f16), T([1536], f16), True, 0.1, 0.001), {})
cnt: 2, ((T([32, 1600, 14, 14], f16), T([1600], f16), T([1600], f16), T([1600], f16), T([1600], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 1664, 14, 14], f16), T([1664], f16), T([1664], f16), T([1664], f16), T([1664], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 1728, 14, 14], f16), T([1728], f16), T([1728], f16), T([1728], f16), T([1728], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 1792, 14, 14], f16), T([1792], f16), T([1792], f16), T([1792], f16), T([1792], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 1856, 14, 14], f16), T([1856], f16), T([1856], f16), T([1856], f16), T([1856], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 1920, 14, 14], f16), T([1920], f16), T([1920], f16), T([1920], f16), T([1920], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 1984, 14, 14], f16), T([1984], f16), T([1984], f16), T([1984], f16), T([1984], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 2048, 14, 14], f16), T([2048], f16), T([2048], f16), T([2048], f16), T([2048], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 2112, 14, 14], f16), T([2112], f16), T([2112], f16), T([2112], f16), T([2112], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 2176, 14, 14], f16), T([2176], f16), T([2176], f16), T([2176], f16), T([2176], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 2240, 14, 14], f16), T([2240], f16), T([2240], f16), T([2240], f16), T([2240], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 2304, 14, 14], f16), T([2304], f16), T([2304], f16), T([2304], f16), T([2304], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 2368, 14, 14], f16), T([2368], f16), T([2368], f16), T([2368], f16), T([2368], f16), True, 0.1, 0.001), {})
cnt: 2, ((T([32, 2432, 14, 14], f16), T([2432], f16), T([2432], f16), T([2432], f16), T([2432], f16), True, 0.1, 0.001), {})
cnt: 5, ((T([32, 1600, 7, 7], f16), T([1600], f16), T([1600], f16), T([1600], f16), T([1600], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 2432, 7, 7], f16), T([2432], f16), T([2432], f16), T([2432], f16), T([2432], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 2560, 7, 7], f16), T([2560], f16), T([2560], f16), T([2560], f16), T([2560], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([32, 2688, 7, 7], f16), T([2688], f16), T([2688], f16), T([2688], f16), T([2688], f16), True, 0.1, 0.001), {})
Operator: aten.native_batch_norm_backward.default
cnt: 1, ((T([32, 2688, 7, 7], f16), T([32, 2688, 7, 7], f16), T([2688], f16), T([2688], f16), T([2688], f16), T([2688], f32), T([2688], f32), True, 0.001, [True, True, True]), {})
cnt: 5, ((T([32, 1600, 7, 7], f16), T([32, 1600, 7, 7], f16), T([1600], f16), T([1600], f16), T([1600], f16), T([1600], f32), T([1600], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 2560, 7, 7], f16), T([32, 2560, 7, 7], f16), T([2560], f16), T([2560], f16), T([2560], f16), T([2560], f32), T([2560], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 2432, 7, 7], f16), T([32, 2432, 7, 7], f16), T([2432], f16), T([2432], f16), T([2432], f16), T([2432], f32), T([2432], f32), True, 0.001, [True, True, True]), {})
cnt: 2, ((T([32, 1600, 14, 14], f16), T([32, 1600, 14, 14], f16), T([1600], f16), T([1600], f16), T([1600], f16), T([1600], f32), T([1600], f32), True, 0.001, [True, True, True]), {})
cnt: 2, ((T([32, 2432, 14, 14], f16), T([32, 2432, 14, 14], f16), T([2432], f16), T([2432], f16), T([2432], f16), T([2432], f32), T([2432], f32), True, 0.001, [True, True, True]), {})
cnt: 39, ((T([32, 800, 14, 14], f16), T([32, 800, 14, 14], f16), T([800], f16), T([800], f16), T([800], f16), T([800], f32), T([800], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 2368, 14, 14], f16), T([32, 2368, 14, 14], f16), T([2368], f16), T([2368], f16), T([2368], f16), T([2368], f32), T([2368], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 2304, 14, 14], f16), T([32, 2304, 14, 14], f16), T([2304], f16), T([2304], f16), T([2304], f16), T([2304], f32), T([2304], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 2240, 14, 14], f16), T([32, 2240, 14, 14], f16), T([2240], f16), T([2240], f16), T([2240], f16), T([2240], f32), T([2240], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 2176, 14, 14], f16), T([32, 2176, 14, 14], f16), T([2176], f16), T([2176], f16), T([2176], f16), T([2176], f32), T([2176], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 2112, 14, 14], f16), T([32, 2112, 14, 14], f16), T([2112], f16), T([2112], f16), T([2112], f16), T([2112], f32), T([2112], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 2048, 14, 14], f16), T([32, 2048, 14, 14], f16), T([2048], f16), T([2048], f16), T([2048], f16), T([2048], f32), T([2048], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 1984, 14, 14], f16), T([32, 1984, 14, 14], f16), T([1984], f16), T([1984], f16), T([1984], f16), T([1984], f32), T([1984], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 1920, 14, 14], f16), T([32, 1920, 14, 14], f16), T([1920], f16), T([1920], f16), T([1920], f16), T([1920], f32), T([1920], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 1856, 14, 14], f16), T([32, 1856, 14, 14], f16), T([1856], f16), T([1856], f16), T([1856], f16), T([1856], f32), T([1856], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 1792, 14, 14], f16), T([32, 1792, 14, 14], f16), T([1792], f16), T([1792], f16), T([1792], f16), T([1792], f32), T([1792], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 1728, 14, 14], f16), T([32, 1728, 14, 14], f16), T([1728], f16), T([1728], f16), T([1728], f16), T([1728], f32), T([1728], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 1664, 14, 14], f16), T([32, 1664, 14, 14], f16), T([1664], f16), T([1664], f16), T([1664], f16), T([1664], f32), T([1664], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 1536, 14, 14], f16), T([32, 1536, 14, 14], f16), T([1536], f16), T([1536], f16), T([1536], f16), T([1536], f32), T([1536], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 1472, 14, 14], f16), T([32, 1472, 14, 14], f16), T([1472], f16), T([1472], f16), T([1472], f16), T([1472], f32), T([1472], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 1408, 14, 14], f16), T([32, 1408, 14, 14], f16), T([1408], f16), T([1408], f16), T([1408], f16), T([1408], f32), T([1408], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 1344, 14, 14], f16), T([32, 1344, 14, 14], f16), T([1344], f16), T([1344], f16), T([1344], f16), T([1344], f32), T([1344], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 1280, 14, 14], f16), T([32, 1280, 14, 14], f16), T([1280], f16), T([1280], f16), T([1280], f16), T([1280], f32), T([1280], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 1216, 14, 14], f16), T([32, 1216, 14, 14], f16), T([1216], f16), T([1216], f16), T([1216], f16), T([1216], f32), T([1216], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 800, 28, 28], f16), T([32, 800, 28, 28], f16), T([800], f16), T([800], f16), T([800], f16), T([800], f32), T([800], f32), True, 0.001, [True, True, True]), {})
cnt: 2, ((T([32, 1152, 28, 28], f16), T([32, 1152, 28, 28], f16), T([1152], f16), T([1152], f16), T([1152], f16), T([1152], f32), T([1152], f32), True, 0.001, [True, True, True]), {})
cnt: 15, ((T([32, 400, 28, 28], f16), T([32, 400, 28, 28], f16), T([400], f16), T([400], f16), T([400], f16), T([400], f32), T([400], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 1088, 28, 28], f16), T([32, 1088, 28, 28], f16), T([1088], f16), T([1088], f16), T([1088], f16), T([1088], f32), T([1088], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 1024, 28, 28], f16), T([32, 1024, 28, 28], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f32), T([1024], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 960, 28, 28], f16), T([32, 960, 28, 28], f16), T([960], f16), T([960], f16), T([960], f16), T([960], f32), T([960], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 896, 28, 28], f16), T([32, 896, 28, 28], f16), T([896], f16), T([896], f16), T([896], f16), T([896], f32), T([896], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 832, 28, 28], f16), T([32, 832, 28, 28], f16), T([832], f16), T([832], f16), T([832], f16), T([832], f32), T([832], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 768, 28, 28], f16), T([32, 768, 28, 28], f16), T([768], f16), T([768], f16), T([768], f16), T([768], f32), T([768], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 704, 28, 28], f16), T([32, 704, 28, 28], f16), T([704], f16), T([704], f16), T([704], f16), T([704], f32), T([704], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 400, 56, 56], f16), T([32, 400, 56, 56], f16), T([400], f16), T([400], f16), T([400], f16), T([400], f32), T([400], f32), True, 0.001, [True, True, True]), {})
cnt: 2, ((T([32, 376, 56, 56], f16), T([32, 376, 56, 56], f16), T([376], f16), T([376], f16), T([376], f16), T([376], f32), T([376], f32), True, 0.001, [True, True, True]), {})
cnt: 8, ((T([32, 200, 56, 56], f16), T([32, 200, 56, 56], f16), T([200], f16), T([200], f16), T([200], f16), T([200], f32), T([200], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 356, 56, 56], f16), T([32, 356, 56, 56], f16), T([356], f16), T([356], f16), T([356], f16), T([356], f32), T([356], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 336, 56, 56], f16), T([32, 336, 56, 56], f16), T([336], f16), T([336], f16), T([336], f16), T([336], f32), T([336], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 316, 56, 56], f16), T([32, 316, 56, 56], f16), T([316], f16), T([316], f16), T([316], f16), T([316], f32), T([316], f32), True, 0.001, [True, True, True]), {})
cnt: 2, ((T([32, 128, 56, 56], f16), T([32, 128, 56, 56], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([32, 128, 112, 112], f16), T([32, 128, 112, 112], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 0.001, [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([32, 1000], f16), T([32], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([32, 1000], f16), T([32], i64), None, 1, -100), {})
Operator: aten.relu_.default
cnt: 1, ((T([32, 128, 112, 112], f16),), {})
cnt: 2, ((T([32, 128, 56, 56], f16),), {})
cnt: 8, ((T([32, 200, 56, 56], f16),), {})
cnt: 1, ((T([32, 316, 56, 56], f16),), {})
cnt: 1, ((T([32, 336, 56, 56], f16),), {})
cnt: 1, ((T([32, 356, 56, 56], f16),), {})
cnt: 2, ((T([32, 376, 56, 56], f16),), {})
cnt: 1, ((T([32, 400, 56, 56], f16),), {})
cnt: 15, ((T([32, 400, 28, 28], f16),), {})
cnt: 1, ((T([32, 704, 28, 28], f16),), {})
cnt: 1, ((T([32, 768, 28, 28], f16),), {})
cnt: 1, ((T([32, 832, 28, 28], f16),), {})
cnt: 1, ((T([32, 896, 28, 28], f16),), {})
cnt: 1, ((T([32, 960, 28, 28], f16),), {})
cnt: 1, ((T([32, 1024, 28, 28], f16),), {})
cnt: 1, ((T([32, 1088, 28, 28], f16),), {})
cnt: 2, ((T([32, 1152, 28, 28], f16),), {})
cnt: 1, ((T([32, 800, 28, 28], f16),), {})
cnt: 39, ((T([32, 800, 14, 14], f16),), {})
cnt: 1, ((T([32, 1216, 14, 14], f16),), {})
cnt: 1, ((T([32, 1280, 14, 14], f16),), {})
cnt: 1, ((T([32, 1344, 14, 14], f16),), {})
cnt: 1, ((T([32, 1408, 14, 14], f16),), {})
cnt: 1, ((T([32, 1472, 14, 14], f16),), {})
cnt: 1, ((T([32, 1536, 14, 14], f16),), {})
cnt: 2, ((T([32, 1600, 14, 14], f16),), {})
cnt: 1, ((T([32, 1664, 14, 14], f16),), {})
cnt: 1, ((T([32, 1728, 14, 14], f16),), {})
cnt: 1, ((T([32, 1792, 14, 14], f16),), {})
cnt: 1, ((T([32, 1856, 14, 14], f16),), {})
cnt: 1, ((T([32, 1920, 14, 14], f16),), {})
cnt: 1, ((T([32, 1984, 14, 14], f16),), {})
cnt: 1, ((T([32, 2048, 14, 14], f16),), {})
cnt: 1, ((T([32, 2112, 14, 14], f16),), {})
cnt: 1, ((T([32, 2176, 14, 14], f16),), {})
cnt: 1, ((T([32, 2240, 14, 14], f16),), {})
cnt: 1, ((T([32, 2304, 14, 14], f16),), {})
cnt: 1, ((T([32, 2368, 14, 14], f16),), {})
cnt: 2, ((T([32, 2432, 14, 14], f16),), {})
cnt: 5, ((T([32, 1600, 7, 7], f16),), {})
cnt: 1, ((T([32, 2432, 7, 7], f16),), {})
cnt: 1, ((T([32, 2560, 7, 7], f16),), {})
Operator: aten.slice_backward.default
cnt: 1, ((T([32, 128, 7, 7], f16, stride=(131712, 49, 7, 1)), [32, 128, 7, 7], 3, 0, 9223372036854775807, 1), {})
cnt: 3, ((T([32, 128, 7, 7], f16), [32, 128, 7, 7], 2, 0, 9223372036854775807, 1), {})
cnt: 3, ((T([32, 128, 7, 7], f16), [32, 2176, 7, 7], 1, 2048, 9223372036854775807, 1), {})
cnt: 6, ((T([32, 2176, 7, 7], f16), [32, 2176, 7, 7], 0, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 2048, 7, 7], f16, stride=(131712, 49, 7, 1)), [32, 2048, 7, 7], 3, 0, 9223372036854775807, 1), {})
cnt: 4, ((T([32, 2048, 7, 7], f16), [32, 2048, 7, 7], 2, 0, 9223372036854775807, 1), {})
cnt: 3, ((T([32, 2048, 7, 7], f16), [32, 2176, 7, 7], 1, 0, 2048, 1), {})
cnt: 1, ((T([32, 128, 7, 7], f16, stride=(25088, 49, 7, 1)), [32, 128, 7, 7], 3, 0, 9223372036854775807, 1), {})
cnt: 3, ((T([32, 2048, 7, 7], f16), [32, 2048, 7, 7], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 128, 7, 7], f16, stride=(18816, 49, 7, 1)), [32, 128, 7, 7], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 256, 7, 7], f16, stride=(18816, 49, 7, 1)), [32, 256, 7, 7], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 256, 7, 7], f16), [32, 256, 7, 7], 2, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 256, 7, 7], f16), [32, 2304, 7, 7], 1, 2048, 9223372036854775807, 1), {})
cnt: 2, ((T([32, 2304, 7, 7], f16), [32, 2304, 7, 7], 0, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 2048, 7, 7], f16), [32, 2304, 7, 7], 1, 0, 2048, 1), {})
cnt: 1, ((T([32, 64, 14, 14], f16, stride=(476672, 196, 14, 1)), [32, 64, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 20, ((T([32, 64, 14, 14], f16), [32, 64, 14, 14], 2, 0, 9223372036854775807, 1), {})
cnt: 20, ((T([32, 64, 14, 14], f16), [32, 1088, 14, 14], 1, 1024, 9223372036854775807, 1), {})
cnt: 40, ((T([32, 1088, 14, 14], f16), [32, 1088, 14, 14], 0, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 1024, 14, 14], f16, stride=(476672, 196, 14, 1)), [32, 1024, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 21, ((T([32, 1024, 14, 14], f16), [32, 1024, 14, 14], 2, 0, 9223372036854775807, 1), {})
cnt: 20, ((T([32, 1024, 14, 14], f16), [32, 1088, 14, 14], 1, 0, 1024, 1), {})
cnt: 1, ((T([32, 64, 14, 14], f16, stride=(263424, 196, 14, 1)), [32, 64, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 20, ((T([32, 1024, 14, 14], f16), [32, 1024, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 64, 14, 14], f16, stride=(250880, 196, 14, 1)), [32, 64, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 64, 14, 14], f16, stride=(238336, 196, 14, 1)), [32, 64, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 64, 14, 14], f16, stride=(225792, 196, 14, 1)), [32, 64, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 64, 14, 14], f16, stride=(213248, 196, 14, 1)), [32, 64, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 64, 14, 14], f16, stride=(200704, 196, 14, 1)), [32, 64, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 64, 14, 14], f16, stride=(188160, 196, 14, 1)), [32, 64, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 64, 14, 14], f16, stride=(175616, 196, 14, 1)), [32, 64, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 64, 14, 14], f16, stride=(163072, 196, 14, 1)), [32, 64, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 64, 14, 14], f16, stride=(150528, 196, 14, 1)), [32, 64, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 64, 14, 14], f16, stride=(137984, 196, 14, 1)), [32, 64, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 64, 14, 14], f16, stride=(125440, 196, 14, 1)), [32, 64, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 64, 14, 14], f16, stride=(112896, 196, 14, 1)), [32, 64, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 64, 14, 14], f16, stride=(100352, 196, 14, 1)), [32, 64, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 64, 14, 14], f16, stride=(87808, 196, 14, 1)), [32, 64, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 64, 14, 14], f16, stride=(75264, 196, 14, 1)), [32, 64, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 64, 14, 14], f16, stride=(62720, 196, 14, 1)), [32, 64, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 64, 14, 14], f16, stride=(50176, 196, 14, 1)), [32, 64, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 64, 14, 14], f16, stride=(37632, 196, 14, 1)), [32, 64, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 128, 14, 14], f16, stride=(37632, 196, 14, 1)), [32, 128, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 128, 14, 14], f16), [32, 128, 14, 14], 2, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 128, 14, 14], f16), [32, 1152, 14, 14], 1, 1024, 9223372036854775807, 1), {})
cnt: 2, ((T([32, 1152, 14, 14], f16), [32, 1152, 14, 14], 0, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), [32, 1152, 14, 14], 1, 0, 1024, 1), {})
cnt: 1, ((T([32, 64, 28, 28], f16, stride=(903168, 784, 28, 1)), [32, 64, 28, 28], 3, 0, 9223372036854775807, 1), {})
cnt: 8, ((T([32, 64, 28, 28], f16), [32, 64, 28, 28], 2, 0, 9223372036854775807, 1), {})
cnt: 8, ((T([32, 64, 28, 28], f16), [32, 576, 28, 28], 1, 512, 9223372036854775807, 1), {})
cnt: 16, ((T([32, 576, 28, 28], f16), [32, 576, 28, 28], 0, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 512, 28, 28], f16, stride=(903168, 784, 28, 1)), [32, 512, 28, 28], 3, 0, 9223372036854775807, 1), {})
cnt: 9, ((T([32, 512, 28, 28], f16), [32, 512, 28, 28], 2, 0, 9223372036854775807, 1), {})
cnt: 8, ((T([32, 512, 28, 28], f16), [32, 576, 28, 28], 1, 0, 512, 1), {})
cnt: 1, ((T([32, 64, 28, 28], f16, stride=(451584, 784, 28, 1)), [32, 64, 28, 28], 3, 0, 9223372036854775807, 1), {})
cnt: 8, ((T([32, 512, 28, 28], f16), [32, 512, 28, 28], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 64, 28, 28], f16, stride=(401408, 784, 28, 1)), [32, 64, 28, 28], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 64, 28, 28], f16, stride=(351232, 784, 28, 1)), [32, 64, 28, 28], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 64, 28, 28], f16, stride=(301056, 784, 28, 1)), [32, 64, 28, 28], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 64, 28, 28], f16, stride=(250880, 784, 28, 1)), [32, 64, 28, 28], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 64, 28, 28], f16, stride=(200704, 784, 28, 1)), [32, 64, 28, 28], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 64, 28, 28], f16, stride=(150528, 784, 28, 1)), [32, 64, 28, 28], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 128, 28, 28], f16, stride=(150528, 784, 28, 1)), [32, 128, 28, 28], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 128, 28, 28], f16), [32, 128, 28, 28], 2, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 128, 28, 28], f16), [32, 640, 28, 28], 1, 512, 9223372036854775807, 1), {})
cnt: 2, ((T([32, 640, 28, 28], f16), [32, 640, 28, 28], 0, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 512, 28, 28], f16), [32, 640, 28, 28], 1, 0, 512, 1), {})
cnt: 1, ((T([32, 20, 56, 56], f16, stride=(1179136, 3136, 56, 1)), [32, 20, 56, 56], 3, 0, 9223372036854775807, 1), {})
cnt: 4, ((T([32, 20, 56, 56], f16), [32, 20, 56, 56], 2, 0, 9223372036854775807, 1), {})
cnt: 4, ((T([32, 20, 56, 56], f16), [32, 276, 56, 56], 1, 256, 9223372036854775807, 1), {})
cnt: 8, ((T([32, 276, 56, 56], f16), [32, 276, 56, 56], 0, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 256, 56, 56], f16, stride=(1179136, 3136, 56, 1)), [32, 256, 56, 56], 3, 0, 9223372036854775807, 1), {})
cnt: 5, ((T([32, 256, 56, 56], f16), [32, 256, 56, 56], 2, 0, 9223372036854775807, 1), {})
cnt: 4, ((T([32, 256, 56, 56], f16), [32, 276, 56, 56], 1, 0, 256, 1), {})
cnt: 1, ((T([32, 20, 56, 56], f16, stride=(313600, 3136, 56, 1)), [32, 20, 56, 56], 3, 0, 9223372036854775807, 1), {})
cnt: 4, ((T([32, 256, 56, 56], f16), [32, 256, 56, 56], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 20, 56, 56], f16, stride=(250880, 3136, 56, 1)), [32, 20, 56, 56], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 20, 56, 56], f16, stride=(188160, 3136, 56, 1)), [32, 20, 56, 56], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 40, 56, 56], f16, stride=(188160, 3136, 56, 1)), [32, 40, 56, 56], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 40, 56, 56], f16), [32, 40, 56, 56], 2, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 40, 56, 56], f16), [32, 296, 56, 56], 1, 256, 9223372036854775807, 1), {})
cnt: 2, ((T([32, 296, 56, 56], f16), [32, 296, 56, 56], 0, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([32, 256, 56, 56], f16), [32, 296, 56, 56], 1, 0, 256, 1), {})
Operator: aten.threshold_backward.default
cnt: 5, ((T([32, 1600, 7, 7], f16), T([32, 1600, 7, 7], f16), 0), {})
cnt: 1, ((T([32, 2560, 7, 7], f16), T([32, 2560, 7, 7], f16), 0), {})
cnt: 1, ((T([32, 2432, 7, 7], f16), T([32, 2432, 7, 7], f16), 0), {})
cnt: 2, ((T([32, 1600, 14, 14], f16), T([32, 1600, 14, 14], f16), 0), {})
cnt: 2, ((T([32, 2432, 14, 14], f16), T([32, 2432, 14, 14], f16), 0), {})
cnt: 39, ((T([32, 800, 14, 14], f16), T([32, 800, 14, 14], f16), 0), {})
cnt: 1, ((T([32, 2368, 14, 14], f16), T([32, 2368, 14, 14], f16), 0), {})
cnt: 1, ((T([32, 2304, 14, 14], f16), T([32, 2304, 14, 14], f16), 0), {})
cnt: 1, ((T([32, 2240, 14, 14], f16), T([32, 2240, 14, 14], f16), 0), {})
cnt: 1, ((T([32, 2176, 14, 14], f16), T([32, 2176, 14, 14], f16), 0), {})
cnt: 1, ((T([32, 2112, 14, 14], f16), T([32, 2112, 14, 14], f16), 0), {})
cnt: 1, ((T([32, 2048, 14, 14], f16), T([32, 2048, 14, 14], f16), 0), {})
cnt: 1, ((T([32, 1984, 14, 14], f16), T([32, 1984, 14, 14], f16), 0), {})
cnt: 1, ((T([32, 1920, 14, 14], f16), T([32, 1920, 14, 14], f16), 0), {})
cnt: 1, ((T([32, 1856, 14, 14], f16), T([32, 1856, 14, 14], f16), 0), {})
cnt: 1, ((T([32, 1792, 14, 14], f16), T([32, 1792, 14, 14], f16), 0), {})
cnt: 1, ((T([32, 1728, 14, 14], f16), T([32, 1728, 14, 14], f16), 0), {})
cnt: 1, ((T([32, 1664, 14, 14], f16), T([32, 1664, 14, 14], f16), 0), {})
cnt: 1, ((T([32, 1536, 14, 14], f16), T([32, 1536, 14, 14], f16), 0), {})
cnt: 1, ((T([32, 1472, 14, 14], f16), T([32, 1472, 14, 14], f16), 0), {})
cnt: 1, ((T([32, 1408, 14, 14], f16), T([32, 1408, 14, 14], f16), 0), {})
cnt: 1, ((T([32, 1344, 14, 14], f16), T([32, 1344, 14, 14], f16), 0), {})
cnt: 1, ((T([32, 1280, 14, 14], f16), T([32, 1280, 14, 14], f16), 0), {})
cnt: 1, ((T([32, 1216, 14, 14], f16), T([32, 1216, 14, 14], f16), 0), {})
cnt: 1, ((T([32, 800, 28, 28], f16), T([32, 800, 28, 28], f16), 0), {})
cnt: 2, ((T([32, 1152, 28, 28], f16), T([32, 1152, 28, 28], f16), 0), {})
cnt: 15, ((T([32, 400, 28, 28], f16), T([32, 400, 28, 28], f16), 0), {})
cnt: 1, ((T([32, 1088, 28, 28], f16), T([32, 1088, 28, 28], f16), 0), {})
cnt: 1, ((T([32, 1024, 28, 28], f16), T([32, 1024, 28, 28], f16), 0), {})
cnt: 1, ((T([32, 960, 28, 28], f16), T([32, 960, 28, 28], f16), 0), {})
cnt: 1, ((T([32, 896, 28, 28], f16), T([32, 896, 28, 28], f16), 0), {})
cnt: 1, ((T([32, 832, 28, 28], f16), T([32, 832, 28, 28], f16), 0), {})
cnt: 1, ((T([32, 768, 28, 28], f16), T([32, 768, 28, 28], f16), 0), {})
cnt: 1, ((T([32, 704, 28, 28], f16), T([32, 704, 28, 28], f16), 0), {})
cnt: 1, ((T([32, 400, 56, 56], f16), T([32, 400, 56, 56], f16), 0), {})
cnt: 2, ((T([32, 376, 56, 56], f16), T([32, 376, 56, 56], f16), 0), {})
cnt: 8, ((T([32, 200, 56, 56], f16), T([32, 200, 56, 56], f16), 0), {})
cnt: 1, ((T([32, 356, 56, 56], f16), T([32, 356, 56, 56], f16), 0), {})
cnt: 1, ((T([32, 336, 56, 56], f16), T([32, 336, 56, 56], f16), 0), {})
cnt: 1, ((T([32, 316, 56, 56], f16), T([32, 316, 56, 56], f16), 0), {})
cnt: 2, ((T([32, 128, 56, 56], f16), T([32, 128, 56, 56], f16), 0), {})
cnt: 1, ((T([32, 128, 112, 112], f16), T([32, 128, 112, 112], f16), 0), {})

View File

@ -0,0 +1,288 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([128, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([128, 1000], f16), T([128, 1000], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 2, ((T([512, 256, 256], f16), -1, False), {})
cnt: 1, ((T([512, 64, 64], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 1, ((T([512, 64, 64], f16), T([512, 64, 64], f16), -1, f16), {})
cnt: 2, ((T([512, 256, 256], f16), T([512, 256, 256], f16), -1, f16), {})
Operator: aten._unsafe_view.default
cnt: 4, ((T([128, 64, 16, 16], f16), [512, 16, 256]), {})
cnt: 1, ((T([128, 256, 16, 16], f16), [512, 64, 256]), {})
cnt: 2, ((T([512, 256, 256], f16), [512, 256, 256]), {})
cnt: 4, ((T([512, 16, 16, 16], f16), [131072, 16]), {})
cnt: 4, ((T([131072, 31], f16), [512, 16, 16, 31]), {})
cnt: 2, ((T([512, 16, 16, 16, 16], f16), [512, 256, 256]), {})
cnt: 1, ((T([512, 256, 64], f16), [512, 256, 64]), {})
cnt: 2, ((T([512, 64, 256], f16), [128, 256, 16, 16]), {})
cnt: 1, ((T([128, 512, 16, 16], f16), [512, 128, 256]), {})
cnt: 1, ((T([512, 256, 128], f16), [512, 256, 128]), {})
cnt: 2, ((T([512, 128, 256], f16), [128, 512, 16, 16]), {})
cnt: 2, ((T([128, 64, 8, 8], f16), [512, 16, 64]), {})
cnt: 1, ((T([128, 512, 8, 8], f16), [512, 128, 64]), {})
cnt: 1, ((T([512, 64, 64], f16), [512, 64, 64]), {})
cnt: 2, ((T([512, 8, 8, 16], f16), [32768, 16]), {})
cnt: 2, ((T([32768, 15], f16), [512, 8, 8, 15]), {})
cnt: 1, ((T([512, 8, 8, 8, 8], f16), [512, 64, 64]), {})
cnt: 1, ((T([512, 64, 128], f16), [512, 64, 128]), {})
cnt: 2, ((T([512, 128, 64], f16), [128, 512, 8, 8]), {})
cnt: 1, ((T([512, 8, 8, 16], f16), [512, 64, 16]), {})
cnt: 1, ((T([512, 16, 64], f16), [128, 64, 8, 8]), {})
cnt: 2, ((T([512, 16, 16, 16], f16), [512, 256, 16]), {})
cnt: 2, ((T([512, 16, 256], f16), [128, 64, 16, 16]), {})
Operator: aten.add.Tensor
cnt: 31, ((T([], i64), 1), {})
cnt: 4, ((T([128, 256, 64, 64], f16), T([128, 256, 64, 64], f16)), {})
cnt: 4, ((T([128, 512, 32, 32], f16), T([128, 512, 32, 32], f16)), {})
cnt: 4, ((T([128, 1024, 16, 16], f16), T([128, 1024, 16, 16], f16)), {})
cnt: 2, ((T([512, 16, 16, 16, 16], f16, stride=(8432, 31, 527, 1, 0)), T([512, 16, 16, 16, 16], f16, stride=(8432, 527, 31, 0, 1))), {})
cnt: 2, ((T([512, 256, 256], f16), T([512, 256, 256], f16)), {})
cnt: 3, ((T([128, 2048, 8, 8], f16), T([128, 2048, 8, 8], f16)), {})
cnt: 1, ((T([512, 8, 8, 8, 8], f16, stride=(1080, 15, 135, 1, 0)), T([512, 8, 8, 8, 8], f16, stride=(1080, 135, 15, 0, 1))), {})
cnt: 1, ((T([512, 64, 64], f16), T([512, 64, 64], f16)), {})
cnt: 1, ((T([512, 8, 8, 16], f16, stride=(1024, 16, 128, 1)), T([512, 8, 8, 16], f16)), {})
cnt: 1, ((T([512, 64, 16], f16), T([512, 64, 16], f16)), {})
cnt: 2, ((T([512, 16, 16, 16], f16, stride=(4096, 16, 256, 1)), T([512, 16, 16, 16], f16)), {})
cnt: 2, ((T([512, 256, 16], f16), T([512, 256, 16], f16)), {})
cnt: 1, ((T([128, 256, 16, 16], f16), T([128, 256, 16, 16], f16)), {})
cnt: 2, ((T([128, 128, 32, 32], f16), T([128, 128, 32, 32], f16)), {})
cnt: 3, ((T([128, 64, 64, 64], f16), T([128, 64, 64, 64], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([128, 2048], f16), T([2048, 1000], f16, stride=(1, 2048))), {})
Operator: aten.avg_pool2d.default
cnt: 1, ((T([128, 512, 16, 16], f16), [2, 2], [2, 2]), {})
Operator: aten.avg_pool2d_backward.default
cnt: 1, ((T([128, 512, 8, 8], f16), T([128, 512, 16, 16], f16), [2, 2], [2, 2], [0, 0], False, True, None), {})
Operator: aten.bmm.default
cnt: 2, ((T([512, 256, 16], f16, stride=(4096, 1, 256)), T([512, 16, 256], f16)), {})
cnt: 1, ((T([512, 256, 256], f16), T([512, 256, 64], f16, stride=(16384, 1, 256))), {})
cnt: 1, ((T([512, 256, 256], f16), T([512, 256, 128], f16, stride=(32768, 1, 256))), {})
cnt: 1, ((T([512, 64, 16], f16, stride=(1024, 1, 64)), T([512, 16, 64], f16)), {})
cnt: 1, ((T([512, 64, 64], f16), T([512, 64, 128], f16, stride=(8192, 1, 64))), {})
cnt: 1, ((T([512, 64, 64], f16, stride=(4096, 1, 64)), T([512, 64, 128], f16, stride=(8192, 1, 64))), {})
cnt: 1, ((T([512, 64, 128], f16, stride=(8192, 1, 64)), T([512, 128, 64], f16)), {})
cnt: 1, ((T([512, 16, 64], f16), T([512, 64, 64], f16)), {})
cnt: 1, ((T([512, 64, 64], f16), T([512, 64, 16], f16, stride=(1024, 1, 64))), {})
cnt: 1, ((T([512, 256, 256], f16, stride=(65536, 1, 256)), T([512, 256, 128], f16, stride=(32768, 1, 256))), {})
cnt: 1, ((T([512, 256, 128], f16, stride=(32768, 1, 256)), T([512, 128, 256], f16)), {})
cnt: 2, ((T([512, 16, 256], f16), T([512, 256, 256], f16)), {})
cnt: 2, ((T([512, 256, 256], f16), T([512, 256, 16], f16, stride=(4096, 1, 256))), {})
cnt: 1, ((T([512, 256, 256], f16, stride=(65536, 1, 256)), T([512, 256, 64], f16, stride=(16384, 1, 256))), {})
cnt: 1, ((T([512, 256, 64], f16, stride=(16384, 1, 256)), T([512, 64, 256], f16)), {})
Operator: aten.cat.default
cnt: 1, (([T([128, 64, 8, 8], f16), T([128, 64, 8, 8], f16), T([128, 512, 8, 8], f16)], 1), {})
cnt: 1, (([T([128, 64, 16, 16], f16), T([128, 64, 16, 16], f16), T([128, 512, 16, 16], f16)], 1), {})
cnt: 1, (([T([128, 64, 16, 16], f16), T([128, 64, 16, 16], f16), T([128, 256, 16, 16], f16)], 1), {})
Operator: aten.clone.default
cnt: 1, ((T([128, 3, 256, 256], f16),), {})
cnt: 1, ((T([128, 24, 128, 128], f16),), {})
cnt: 1, ((T([128, 32, 128, 128], f16),), {})
cnt: 1, ((T([128, 64, 128, 128], f16),), {})
cnt: 4, ((T([128, 64, 64, 64], f16),), {})
cnt: 2, ((T([128, 256, 64, 64], f16),), {})
cnt: 1, ((T([128, 128, 64, 64], f16),), {})
cnt: 3, ((T([128, 128, 32, 32], f16),), {})
cnt: 2, ((T([128, 512, 32, 32], f16),), {})
cnt: 1, ((T([128, 256, 32, 32], f16),), {})
cnt: 3, ((T([128, 256, 16, 16], f16),), {})
cnt: 2, ((T([128, 1024, 16, 16], f16),), {})
cnt: 1, ((T([128, 512, 16, 16], f16),), {})
cnt: 3, ((T([128, 512, 8, 8], f16),), {})
cnt: 2, ((T([128, 2048, 8, 8], f16),), {})
Operator: aten.constant_pad_nd.default
cnt: 4, ((T([8192, 16, 31], f16), [0, 1], 0.0), {})
cnt: 4, ((T([8192, 512], f16), [0, 15], 0.0), {})
cnt: 2, ((T([4096, 8, 15], f16), [0, 1], 0.0), {})
cnt: 2, ((T([4096, 128], f16), [0, 7], 0.0), {})
cnt: 2, ((T([4096, 135], f16), [0, -7]), {})
cnt: 2, ((T([4096, 8, 16], f16), [0, -1]), {})
cnt: 4, ((T([8192, 527], f16), [0, -15]), {})
cnt: 4, ((T([8192, 16, 32], f16), [0, -1]), {})
Operator: aten.convolution.default
cnt: 1, ((T([128, 3, 256, 256], f16), T([24, 3, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 24, 128, 128], f16), T([32, 24, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 32, 128, 128], f16), T([64, 32, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 64, 64, 64], f16), T([64, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 64, 64, 64], f16), T([64, 16, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 4), {})
cnt: 2, ((T([128, 1, 64], f16), T([1, 1, 3], f16), None, [1], [1], [1], False, [0], 1), {})
cnt: 3, ((T([128, 64, 64, 64], f16), T([256, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 64, 64], f16), T([64, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 64, 64], f16), T([128, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 128, 64, 64], f16), T([128, 16, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 8), {})
cnt: 2, ((T([128, 1, 128], f16), T([1, 1, 5], f16), None, [1], [2], [1], False, [0], 1), {})
cnt: 2, ((T([128, 128, 32, 32], f16), T([512, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 64, 64], f16), T([512, 256, 1, 1], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 512, 32, 32], f16), T([128, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 128, 32, 32], f16), T([128, 16, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 8), {})
cnt: 1, ((T([128, 512, 32, 32], f16), T([256, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 32, 32], f16), T([256, 16, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 16), {})
cnt: 1, ((T([128, 1, 256], f16), T([1, 1, 5], f16), None, [1], [2], [1], False, [0], 1), {})
cnt: 2, ((T([128, 256, 16, 16], f16), T([1024, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 512, 32, 32], f16), T([1024, 512, 1, 1], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1024, 16, 16], f16), T([256, 1024, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 16, 16], f16), T([384, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1024, 16, 16], f16), T([512, 1024, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 512, 16, 16], f16), T([640, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 512, 8, 8], f16), T([2048, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1024, 16, 16], f16), T([2048, 1024, 1, 1], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 2048, 8, 8], f16), T([512, 2048, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 512, 8, 8], f16), T([640, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 2, ((T([128, 2048, 8, 8], f16), T([128, 512, 8, 8], f16), T([2048, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 640, 8, 8], f16), T([128, 512, 8, 8], f16), T([640, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 512, 8, 8], f16), T([128, 2048, 8, 8], f16), T([512, 2048, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 2048, 8, 8], f16), T([128, 1024, 16, 16], f16), T([2048, 1024, 1, 1], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 640, 16, 16], f16), T([128, 512, 16, 16], f16), T([640, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 512, 16, 16], f16), T([128, 1024, 16, 16], f16), T([512, 1024, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 1024, 16, 16], f16), T([128, 256, 16, 16], f16), T([1024, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 384, 16, 16], f16), T([128, 256, 16, 16], f16), T([384, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 256, 16, 16], f16), T([128, 1024, 16, 16], f16), T([256, 1024, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 1024, 16, 16], f16), T([128, 512, 32, 32], f16), T([1024, 512, 1, 1], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 1, 256], f16), T([128, 1, 256], f16), T([1, 1, 5], f16), [0], [1], [2], [1], False, [0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 256, 16, 16], f16), T([128, 256, 32, 32], f16), T([256, 16, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 16, [True, True, False]), {})
cnt: 1, ((T([128, 256, 32, 32], f16), T([128, 512, 32, 32], f16), T([256, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 512, 32, 32], f16), T([128, 128, 32, 32], f16), T([512, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 1, 128], f16), T([128, 1, 128], f16), T([1, 1, 5], f16), [0], [1], [2], [1], False, [0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 128, 32, 32], f16), T([128, 128, 32, 32], f16), T([128, 16, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 8, [True, True, False]), {})
cnt: 1, ((T([128, 128, 32, 32], f16), T([128, 512, 32, 32], f16), T([128, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 512, 32, 32], f16), T([128, 256, 64, 64], f16), T([512, 256, 1, 1], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 128, 32, 32], f16), T([128, 128, 64, 64], f16), T([128, 16, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 8, [True, True, False]), {})
cnt: 1, ((T([128, 128, 64, 64], f16), T([128, 256, 64, 64], f16), T([128, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 256, 64, 64], f16), T([128, 64, 64, 64], f16), T([256, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 1, 64], f16), T([128, 1, 64], f16), T([1, 1, 3], f16), [0], [1], [1], [1], False, [0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 64, 64, 64], f16), T([128, 64, 64, 64], f16), T([64, 16, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 4, [True, True, False]), {})
cnt: 1, ((T([128, 64, 64, 64], f16), T([128, 256, 64, 64], f16), T([64, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 64, 64, 64], f16), T([128, 64, 64, 64], f16), T([64, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 64, 128, 128], f16), T([128, 32, 128, 128], f16), T([64, 32, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 32, 128, 128], f16), T([128, 24, 128, 128], f16), T([32, 24, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 24, 128, 128], f16), T([128, 3, 256, 256], f16), T([24, 3, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [False, True, False]), {})
Operator: aten.copy_.default
cnt: 1, ((T([128, 3, 256, 256], f16), T([128, 3, 256, 256], f16)), {})
Operator: aten.div.Scalar
cnt: 1, ((T([128, 2048, 8, 8], f16, stride=(2048, 1, 0, 0)), 64), {})
cnt: 1, ((T([128, 256, 16, 16], f16, stride=(256, 1, 0, 0)), 256), {})
cnt: 2, ((T([128, 128, 32, 32], f16, stride=(128, 1, 0, 0)), 1024), {})
cnt: 2, ((T([128, 64, 64, 64], f16, stride=(64, 1, 0, 0)), 4096), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([128], i64),), {})
Operator: aten.max_pool2d_with_indices.default
cnt: 1, ((T([128, 64, 128, 128], f16), [3, 3], [2, 2], [1, 1]), {})
Operator: aten.max_pool2d_with_indices_backward.default
cnt: 1, ((T([128, 64, 64, 64], f16), T([128, 64, 128, 128], f16), [3, 3], [2, 2], [1, 1], [1, 1], False, T([128, 64, 64, 64], i64)), {})
Operator: aten.mean.dim
cnt: 2, ((T([128, 64, 64, 64], f16), [2, 3]), {})
cnt: 2, ((T([128, 128, 32, 32], f16), [2, 3]), {})
cnt: 1, ((T([128, 256, 16, 16], f16), [2, 3]), {})
cnt: 1, ((T([128, 2048, 8, 8], f16), [-1, -2], True), {})
Operator: aten.mm.default
cnt: 4, ((T([131072, 16], f16), T([16, 31], f16, stride=(1, 16))), {})
cnt: 2, ((T([32768, 16], f16), T([16, 15], f16, stride=(1, 16))), {})
cnt: 1, ((T([128, 1000], f16), T([1000, 2048], f16)), {})
cnt: 1, ((T([1000, 128], f16, stride=(1, 1000)), T([128, 2048], f16)), {})
cnt: 2, ((T([15, 32768], f16, stride=(1, 15)), T([32768, 16], f16)), {})
cnt: 2, ((T([32768, 15], f16), T([15, 16], f16)), {})
cnt: 4, ((T([31, 131072], f16, stride=(1, 31)), T([131072, 16], f16)), {})
cnt: 4, ((T([131072, 31], f16), T([31, 16], f16)), {})
Operator: aten.mul.Tensor
cnt: 4, ((T([128, 64, 64, 64], f16), T([128, 64, 64, 64], f16, stride=(64, 1, 0, 0))), {})
cnt: 4, ((T([128, 128, 32, 32], f16), T([128, 128, 32, 32], f16, stride=(128, 1, 0, 0))), {})
cnt: 2, ((T([128, 256, 16, 16], f16), T([128, 256, 16, 16], f16, stride=(256, 1, 0, 0))), {})
cnt: 4, ((T([512, 256, 256], f16), 0.25), {})
cnt: 2, ((T([512, 64, 64], f16), 0.25), {})
cnt: 1, ((T([128, 256, 16, 16], f16), T([128, 256, 16, 16], f16)), {})
cnt: 2, ((T([128, 128, 32, 32], f16), T([128, 128, 32, 32], f16)), {})
cnt: 2, ((T([128, 64, 64, 64], f16), T([128, 64, 64, 64], f16)), {})
Operator: aten.native_batch_norm.default
cnt: 1, ((T([128, 24, 128, 128], f16), T([24], f16), T([24], f16), T([24], f16), T([24], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 32, 128, 128], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 64, 128, 128], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([128, 64, 64, 64], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 256, 64, 64], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 128, 64, 64], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 128, 32, 32], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 512, 32, 32], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 256, 32, 32], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 256, 16, 16], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 1024, 16, 16], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 512, 16, 16], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 512, 8, 8], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 2048, 8, 8], f16), T([2048], f16), T([2048], f16), T([2048], f16), T([2048], f16), True, 0.1, 1e-05), {})
Operator: aten.native_batch_norm_backward.default
cnt: 3, ((T([128, 2048, 8, 8], f16), T([128, 2048, 8, 8], f16), T([2048], f16), T([2048], f16), T([2048], f16), T([2048], f32), T([2048], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 512, 8, 8], f16), T([128, 512, 8, 8], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 512, 16, 16], f16), T([128, 512, 16, 16], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 1024, 16, 16], f16), T([128, 1024, 16, 16], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f32), T([1024], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 256, 16, 16], f16), T([128, 256, 16, 16], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 256, 32, 32], f16), T([128, 256, 32, 32], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 512, 32, 32], f16), T([128, 512, 32, 32], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 128, 32, 32], f16), T([128, 128, 32, 32], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 128, 64, 64], f16), T([128, 128, 64, 64], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 256, 64, 64], f16), T([128, 256, 64, 64], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([128, 64, 64, 64], f16), T([128, 64, 64, 64], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 64, 128, 128], f16), T([128, 64, 128, 128], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 32, 128, 128], f16), T([128, 32, 128, 128], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 24, 128, 128], f16), T([128, 24, 128, 128], f16), T([24], f16), T([24], f16), T([24], f16), T([24], f32), T([24], f32), True, 1e-05, [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([128, 1000], f16), T([128], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([128, 1000], f16), T([128], i64), None, 1, -100), {})
Operator: aten.sigmoid.default
cnt: 2, ((T([128, 1, 64], f16),), {})
cnt: 2, ((T([128, 1, 128], f16),), {})
cnt: 1, ((T([128, 1, 256], f16),), {})
Operator: aten.sigmoid_backward.default
cnt: 1, ((T([128, 1, 256], f16), T([128, 1, 256], f16)), {})
cnt: 2, ((T([128, 1, 128], f16), T([128, 1, 128], f16)), {})
cnt: 2, ((T([128, 1, 64], f16), T([128, 1, 64], f16)), {})
Operator: aten.silu_.default
cnt: 1, ((T([128, 24, 128, 128], f16),), {})
cnt: 1, ((T([128, 32, 128, 128], f16),), {})
cnt: 1, ((T([128, 64, 128, 128], f16),), {})
cnt: 4, ((T([128, 64, 64, 64], f16),), {})
cnt: 2, ((T([128, 256, 64, 64], f16),), {})
cnt: 1, ((T([128, 128, 64, 64], f16),), {})
cnt: 3, ((T([128, 128, 32, 32], f16),), {})
cnt: 2, ((T([128, 512, 32, 32], f16),), {})
cnt: 1, ((T([128, 256, 32, 32], f16),), {})
cnt: 3, ((T([128, 256, 16, 16], f16),), {})
cnt: 2, ((T([128, 1024, 16, 16], f16),), {})
cnt: 1, ((T([128, 512, 16, 16], f16),), {})
cnt: 3, ((T([128, 512, 8, 8], f16),), {})
cnt: 2, ((T([128, 2048, 8, 8], f16),), {})
Operator: aten.silu_backward.default
cnt: 2, ((T([128, 2048, 8, 8], f16), T([128, 2048, 8, 8], f16)), {})
cnt: 3, ((T([128, 512, 8, 8], f16), T([128, 512, 8, 8], f16)), {})
cnt: 1, ((T([128, 512, 16, 16], f16), T([128, 512, 16, 16], f16)), {})
cnt: 2, ((T([128, 1024, 16, 16], f16), T([128, 1024, 16, 16], f16)), {})
cnt: 3, ((T([128, 256, 16, 16], f16), T([128, 256, 16, 16], f16)), {})
cnt: 1, ((T([128, 256, 32, 32], f16), T([128, 256, 32, 32], f16)), {})
cnt: 2, ((T([128, 512, 32, 32], f16), T([128, 512, 32, 32], f16)), {})
cnt: 3, ((T([128, 128, 32, 32], f16), T([128, 128, 32, 32], f16)), {})
cnt: 1, ((T([128, 128, 64, 64], f16), T([128, 128, 64, 64], f16)), {})
cnt: 2, ((T([128, 256, 64, 64], f16), T([128, 256, 64, 64], f16)), {})
cnt: 4, ((T([128, 64, 64, 64], f16), T([128, 64, 64, 64], f16)), {})
cnt: 1, ((T([128, 64, 128, 128], f16), T([128, 64, 128, 128], f16)), {})
cnt: 1, ((T([128, 32, 128, 128], f16), T([128, 32, 128, 128], f16)), {})
cnt: 1, ((T([128, 24, 128, 128], f16), T([128, 24, 128, 128], f16)), {})
Operator: aten.slice_backward.default
cnt: 2, ((T([4096, 8, 8], f16), [4096, 8, 15], 2, 7, 9223372036854775807, 1), {})
cnt: 2, ((T([4096, 8, 15], f16), [4096, 9, 15], 1, 0, 8, 1), {})
cnt: 2, ((T([4096, 9, 15], f16), [4096, 9, 15], 0, 0, 9223372036854775807, 1), {})
cnt: 4, ((T([8192, 16, 16], f16), [8192, 16, 31], 2, 15, 9223372036854775807, 1), {})
cnt: 4, ((T([8192, 16, 31], f16), [8192, 17, 31], 1, 0, 16, 1), {})
cnt: 4, ((T([8192, 17, 31], f16), [8192, 17, 31], 0, 0, 9223372036854775807, 1), {})
Operator: aten.split_with_sizes.default
cnt: 1, ((T([128, 384, 16, 16], f16), [64, 64, 256], 1), {})
cnt: 1, ((T([128, 640, 16, 16], f16), [64, 64, 512], 1), {})
cnt: 1, ((T([128, 640, 8, 8], f16), [64, 64, 512], 1), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([128, 1000], f16), [0], True), {})
cnt: 1, ((T([512, 8, 8, 8, 8], f16, stride=(4096, 64, 1, 512, 8)), [2], True), {})
cnt: 1, ((T([512, 8, 8, 8, 8], f16, stride=(4096, 512, 8, 64, 1)), [2], True), {})
cnt: 2, ((T([512, 16, 16, 16, 16], f16, stride=(65536, 256, 1, 4096, 16)), [2], True), {})
cnt: 2, ((T([512, 16, 16, 16, 16], f16, stride=(65536, 4096, 16, 256, 1)), [2], True), {})
cnt: 1, ((T([128, 256, 16, 16], f16), [2, 3], True), {})
cnt: 2, ((T([128, 128, 32, 32], f16), [2, 3], True), {})
cnt: 2, ((T([128, 64, 64, 64], f16), [2, 3], True), {})

View File

@ -0,0 +1,343 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([128, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([128, 1000], f16), T([128, 1000], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 1, ((T([1024, 4, 64, 144], f16), -1, False), {})
cnt: 1, ((T([1024, 4, 16, 144], f16), -1, False), {})
cnt: 1, ((T([1024, 1, 64, 144], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 1, ((T([1024, 1, 64, 144], f16), T([1024, 1, 64, 144], f16), -1, f16), {})
cnt: 1, ((T([1024, 4, 16, 144], f16), T([1024, 4, 16, 144], f16), -1, f16), {})
cnt: 1, ((T([1024, 4, 64, 144], f16), T([1024, 4, 64, 144], f16), -1, f16), {})
Operator: aten._unsafe_view.default
cnt: 1, ((T([1024, 16, 8, 8, 2, 2], f16), [1024, 16, 64, 4]), {})
cnt: 1, ((T([128, 384, 2, 2, 12, 12], f16), [1024, 48, 4, 144]), {})
cnt: 1, ((T([1024, 4, 64, 16], f16), [4096, 64, 16]), {})
cnt: 2, ((T([1024, 4, 16, 144], f16), [4096, 16, 144]), {})
cnt: 1, ((T([4096, 64, 144], f16), [1024, 4, 64, 144]), {})
cnt: 1, ((T([1024, 4, 64, 16], f16), [4096, 8, 8, 16]), {})
cnt: 2, ((T([262144, 23], f16), [4096, 8, 8, 23]), {})
cnt: 1, ((T([4096, 8, 8, 16], f16), [262144, 16]), {})
cnt: 1, ((T([4096, 8, 8, 12, 12], f16), [1024, 4, 64, 144]), {})
cnt: 1, ((T([1024, 4, 144, 32], f16), [4096, 144, 32]), {})
cnt: 1, ((T([4096, 64, 32], f16), [1024, 4, 64, 32]), {})
cnt: 1, ((T([1024, 32, 64, 4], f16), [32768, 8, 8, 2, 2]), {})
cnt: 1, ((T([1024, 16, 4, 4, 2, 2], f16), [1024, 16, 16, 4]), {})
cnt: 1, ((T([128, 640, 2, 2, 12, 12], f16), [1024, 80, 4, 144]), {})
cnt: 1, ((T([1024, 4, 16, 16], f16), [4096, 16, 16]), {})
cnt: 1, ((T([4096, 16, 144], f16), [1024, 4, 16, 144]), {})
cnt: 1, ((T([1024, 4, 16, 16], f16), [4096, 4, 4, 16]), {})
cnt: 2, ((T([65536, 23], f16), [4096, 4, 4, 23]), {})
cnt: 1, ((T([4096, 4, 4, 16], f16), [65536, 16]), {})
cnt: 1, ((T([4096, 4, 4, 12, 12], f16), [1024, 4, 16, 144]), {})
cnt: 1, ((T([1024, 4, 144, 64], f16), [4096, 144, 64]), {})
cnt: 1, ((T([4096, 16, 64], f16), [1024, 4, 16, 64]), {})
cnt: 1, ((T([1024, 64, 16, 4], f16), [65536, 4, 4, 2, 2]), {})
cnt: 1, ((T([1024, 64, 144], f16), [1024, 1, 64, 144]), {})
cnt: 2, ((T([1024, 8, 8, 16], f16), [65536, 16]), {})
cnt: 2, ((T([65536, 23], f16), [1024, 8, 8, 23]), {})
cnt: 1, ((T([1024, 8, 8, 12, 12], f16), [1024, 1, 64, 144]), {})
cnt: 1, ((T([1024, 64, 64], f16), [1024, 1, 64, 64]), {})
cnt: 1, ((T([1024, 64, 64, 1], f16), [65536, 8, 8, 1, 1]), {})
cnt: 1, ((T([1024, 8, 8, 16], f16), [1024, 1, 64, 16]), {})
cnt: 1, ((T([1024, 80, 1, 144], f16), [128, 640, 1, 1, 12, 12]), {})
cnt: 1, ((T([1024, 16, 1, 8, 1, 8], f16), [128, 128, 8, 8]), {})
cnt: 1, ((T([65536, 4, 4, 2, 2], f16), [1024, 64, 16, 4]), {})
cnt: 1, ((T([1024, 4, 16, 64], f16), [4096, 16, 64]), {})
cnt: 1, ((T([4096, 4, 4, 16], f16), [1024, 4, 16, 16]), {})
cnt: 1, ((T([1024, 80, 4, 144], f16), [128, 640, 2, 2, 12, 12]), {})
cnt: 1, ((T([1024, 16, 2, 4, 2, 4], f16), [128, 128, 8, 8]), {})
cnt: 1, ((T([32768, 8, 8, 2, 2], f16), [1024, 32, 64, 4]), {})
cnt: 1, ((T([1024, 4, 64, 32], f16), [4096, 64, 32]), {})
cnt: 1, ((T([4096, 8, 8, 16], f16), [1024, 4, 64, 16]), {})
cnt: 1, ((T([1024, 48, 4, 144], f16), [128, 384, 2, 2, 12, 12]), {})
cnt: 1, ((T([1024, 16, 2, 8, 2, 8], f16), [128, 128, 16, 16]), {})
Operator: aten.add.Tensor
cnt: 31, ((T([], i64), 1), {})
cnt: 4, ((T([128, 256, 64, 64], f16), T([128, 256, 64, 64], f16)), {})
cnt: 4, ((T([128, 512, 32, 32], f16), T([128, 512, 32, 32], f16)), {})
cnt: 4, ((T([128, 1024, 16, 16], f16), T([128, 1024, 16, 16], f16)), {})
cnt: 1, ((T([4096, 8, 8, 12, 12], f16, stride=(1656, 23, 207, 1, 0)), T([4096, 8, 8, 12, 12], f16, stride=(1656, 207, 23, 0, 1))), {})
cnt: 1, ((T([1024, 4, 64, 144], f16), T([1024, 4, 64, 144], f16)), {})
cnt: 1, ((T([4096, 4, 4, 12, 12], f16, stride=(460, 23, 115, 1, 0)), T([4096, 4, 4, 12, 12], f16, stride=(460, 115, 23, 0, 1))), {})
cnt: 1, ((T([1024, 4, 16, 144], f16), T([1024, 4, 16, 144], f16)), {})
cnt: 3, ((T([128, 2048, 8, 8], f16), T([128, 2048, 8, 8], f16)), {})
cnt: 1, ((T([1024, 8, 8, 12, 12], f16, stride=(1656, 23, 207, 1, 0)), T([1024, 8, 8, 12, 12], f16, stride=(1656, 207, 23, 0, 1))), {})
cnt: 1, ((T([1024, 1, 64, 144], f16), T([1024, 1, 64, 144], f16)), {})
cnt: 1, ((T([1024, 8, 8, 16], f16, stride=(1024, 16, 128, 1)), T([1024, 8, 8, 16], f16)), {})
cnt: 1, ((T([1024, 1, 64, 16], f16), T([1024, 1, 64, 16], f16)), {})
cnt: 1, ((T([128, 512, 8, 8], f16), T([128, 512, 8, 8], f16)), {})
cnt: 1, ((T([4096, 4, 4, 16], f16, stride=(256, 16, 64, 1)), T([4096, 4, 4, 16], f16)), {})
cnt: 1, ((T([1024, 4, 16, 16], f16), T([1024, 4, 16, 16], f16)), {})
cnt: 1, ((T([128, 512, 16, 16], f16), T([128, 512, 16, 16], f16)), {})
cnt: 1, ((T([4096, 8, 8, 16], f16, stride=(1024, 16, 128, 1)), T([4096, 8, 8, 16], f16)), {})
cnt: 1, ((T([1024, 4, 64, 16], f16), T([1024, 4, 64, 16], f16)), {})
cnt: 2, ((T([128, 256, 16, 16], f16), T([128, 256, 16, 16], f16)), {})
cnt: 2, ((T([128, 128, 32, 32], f16), T([128, 128, 32, 32], f16)), {})
cnt: 3, ((T([128, 64, 64, 64], f16), T([128, 64, 64, 64], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([128, 2048], f16), T([2048, 1000], f16, stride=(1, 2048))), {})
Operator: aten.bmm.default
cnt: 1, ((T([4096, 64, 16], f16), T([4096, 16, 144], f16)), {})
cnt: 1, ((T([4096, 64, 144], f16), T([4096, 144, 32], f16)), {})
cnt: 1, ((T([4096, 16, 16], f16), T([4096, 16, 144], f16)), {})
cnt: 1, ((T([4096, 16, 144], f16), T([4096, 144, 64], f16)), {})
cnt: 1, ((T([1024, 64, 16], f16, stride=(1024, 1, 64)), T([1024, 16, 144], f16, stride=(11520, 144, 1))), {})
cnt: 1, ((T([1024, 64, 144], f16), T([1024, 144, 64], f16, stride=(11520, 1, 144))), {})
cnt: 1, ((T([1024, 144, 64], f16, stride=(9216, 1, 144)), T([1024, 64, 64], f16, stride=(4096, 1, 64))), {})
cnt: 1, ((T([1024, 64, 64], f16, stride=(4096, 1, 64)), T([1024, 64, 144], f16, stride=(11520, 144, 1))), {})
cnt: 1, ((T([1024, 16, 64], f16), T([1024, 64, 144], f16)), {})
cnt: 1, ((T([1024, 64, 144], f16), T([1024, 144, 16], f16, stride=(11520, 1, 144))), {})
cnt: 1, ((T([4096, 144, 16], f16, stride=(2304, 1, 144)), T([4096, 16, 64], f16)), {})
cnt: 1, ((T([4096, 16, 64], f16), T([4096, 64, 144], f16, stride=(9216, 1, 64))), {})
cnt: 1, ((T([4096, 16, 16], f16, stride=(256, 1, 16)), T([4096, 16, 144], f16)), {})
cnt: 1, ((T([4096, 16, 144], f16), T([4096, 144, 16], f16, stride=(2304, 1, 144))), {})
cnt: 1, ((T([4096, 144, 64], f16, stride=(9216, 1, 144)), T([4096, 64, 32], f16)), {})
cnt: 1, ((T([4096, 64, 32], f16), T([4096, 32, 144], f16, stride=(4608, 1, 32))), {})
cnt: 1, ((T([4096, 16, 64], f16, stride=(1024, 1, 16)), T([4096, 64, 144], f16)), {})
cnt: 1, ((T([4096, 64, 144], f16), T([4096, 144, 16], f16, stride=(2304, 1, 144))), {})
Operator: aten.cat.default
cnt: 1, (([T([1024, 1, 144, 16], f16, stride=(2304, 2304, 1, 144)), T([1024, 1, 144, 64], f16)], 3), {})
cnt: 1, (([T([1024, 4, 144, 16], f16, stride=(9216, 2304, 1, 144)), T([1024, 4, 144, 64], f16)], 3), {})
cnt: 1, (([T([1024, 4, 144, 16], f16, stride=(9216, 2304, 1, 144)), T([1024, 4, 144, 32], f16)], 3), {})
Operator: aten.clone.default
cnt: 1, ((T([128, 3, 256, 256], f16),), {})
cnt: 1, ((T([128, 24, 128, 128], f16),), {})
cnt: 1, ((T([128, 32, 128, 128], f16),), {})
cnt: 1, ((T([128, 64, 128, 128], f16),), {})
cnt: 4, ((T([128, 64, 64, 64], f16),), {})
cnt: 2, ((T([128, 256, 64, 64], f16),), {})
cnt: 1, ((T([128, 128, 64, 64], f16),), {})
cnt: 3, ((T([128, 128, 32, 32], f16),), {})
cnt: 2, ((T([128, 512, 32, 32], f16),), {})
cnt: 1, ((T([128, 256, 32, 32], f16),), {})
cnt: 3, ((T([128, 256, 16, 16], f16),), {})
cnt: 2, ((T([128, 1024, 16, 16], f16),), {})
cnt: 1, ((T([128, 512, 16, 16], f16),), {})
cnt: 3, ((T([128, 512, 8, 8], f16),), {})
cnt: 2, ((T([128, 2048, 8, 8], f16),), {})
Operator: aten.constant_pad_nd.default
cnt: 1, ((T([128, 384, 16, 16], f16), [2, 2, 2, 2], 0.0), {})
cnt: 2, ((T([32768, 8, 23], f16), [0, 1], 0.0), {})
cnt: 2, ((T([32768, 192], f16), [0, 15], 0.0), {})
cnt: 1, ((T([128, 640, 16, 16], f16), [2, 2, 2, 2], 0.0), {})
cnt: 2, ((T([16384, 4, 23], f16), [0, 1], 0.0), {})
cnt: 2, ((T([16384, 96], f16), [0, 19], 0.0), {})
cnt: 1, ((T([128, 640, 8, 8], f16), [2, 2, 2, 2], 0.0), {})
cnt: 2, ((T([8192, 8, 23], f16), [0, 1], 0.0), {})
cnt: 2, ((T([8192, 192], f16), [0, 15], 0.0), {})
cnt: 2, ((T([8192, 207], f16), [0, -15]), {})
cnt: 2, ((T([8192, 8, 24], f16), [0, -1]), {})
cnt: 1, ((T([128, 640, 12, 12], f16), [-2, -2, -2, -2]), {})
cnt: 2, ((T([16384, 115], f16), [0, -19]), {})
cnt: 2, ((T([16384, 4, 24], f16), [0, -1]), {})
cnt: 1, ((T([128, 640, 20, 20], f16), [-2, -2, -2, -2]), {})
cnt: 2, ((T([32768, 207], f16), [0, -15]), {})
cnt: 2, ((T([32768, 8, 24], f16), [0, -1]), {})
cnt: 1, ((T([128, 384, 20, 20], f16), [-2, -2, -2, -2]), {})
Operator: aten.convolution.default
cnt: 1, ((T([128, 3, 256, 256], f16), T([24, 3, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 24, 128, 128], f16), T([32, 24, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 32, 128, 128], f16), T([64, 32, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 64, 64, 64], f16), T([64, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 64, 64, 64], f16), T([64, 16, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 4), {})
cnt: 2, ((T([128, 1, 64], f16), T([1, 1, 3], f16), None, [1], [1], [1], False, [0], 1), {})
cnt: 3, ((T([128, 64, 64, 64], f16), T([256, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 64, 64], f16), T([64, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 64, 64], f16), T([128, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 128, 64, 64], f16), T([128, 16, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 8), {})
cnt: 2, ((T([128, 1, 128], f16), T([1, 1, 5], f16), None, [1], [2], [1], False, [0], 1), {})
cnt: 2, ((T([128, 128, 32, 32], f16), T([512, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 64, 64], f16), T([512, 256, 1, 1], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 512, 32, 32], f16), T([128, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 128, 32, 32], f16), T([128, 16, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 8), {})
cnt: 1, ((T([128, 512, 32, 32], f16), T([256, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 32, 32], f16), T([256, 16, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 16), {})
cnt: 1, ((T([128, 1, 256], f16), T([1, 1, 5], f16), None, [1], [2], [1], False, [0], 1), {})
cnt: 2, ((T([128, 256, 16, 16], f16), T([1024, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 512, 32, 32], f16), T([1024, 512, 1, 1], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1024, 16, 16], f16), T([256, 1024, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 16, 16], f16), T([128, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 16, 16], f16), T([384, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1024, 16, 16], f16), T([512, 1024, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 512, 16, 16], f16), T([128, 512, 1, 1], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 512, 16, 16], f16), T([640, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 512, 8, 8], f16), T([2048, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1024, 16, 16], f16), T([2048, 1024, 1, 1], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 2048, 8, 8], f16), T([512, 2048, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 512, 8, 8], f16), T([128, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 512, 8, 8], f16), T([640, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 2, ((T([128, 2048, 8, 8], f16), T([128, 512, 8, 8], f16), T([2048, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 640, 8, 8], f16), T([128, 512, 8, 8], f16), T([640, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 128, 8, 8], f16), T([128, 512, 8, 8], f16), T([128, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 512, 8, 8], f16), T([128, 2048, 8, 8], f16), T([512, 2048, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 2048, 8, 8], f16), T([128, 1024, 16, 16], f16), T([2048, 1024, 1, 1], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 640, 16, 16], f16), T([128, 512, 16, 16], f16), T([640, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 128, 8, 8], f16), T([128, 512, 16, 16], f16), T([128, 512, 1, 1], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 512, 16, 16], f16), T([128, 1024, 16, 16], f16), T([512, 1024, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 1024, 16, 16], f16), T([128, 256, 16, 16], f16), T([1024, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 384, 16, 16], f16), T([128, 256, 16, 16], f16), T([384, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 128, 16, 16], f16), T([128, 256, 16, 16], f16), T([128, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 256, 16, 16], f16), T([128, 1024, 16, 16], f16), T([256, 1024, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 1024, 16, 16], f16), T([128, 512, 32, 32], f16), T([1024, 512, 1, 1], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 1, 256], f16), T([128, 1, 256], f16), T([1, 1, 5], f16), [0], [1], [2], [1], False, [0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 256, 16, 16], f16), T([128, 256, 32, 32], f16), T([256, 16, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 16, [True, True, False]), {})
cnt: 1, ((T([128, 256, 32, 32], f16), T([128, 512, 32, 32], f16), T([256, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 512, 32, 32], f16), T([128, 128, 32, 32], f16), T([512, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 1, 128], f16), T([128, 1, 128], f16), T([1, 1, 5], f16), [0], [1], [2], [1], False, [0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 128, 32, 32], f16), T([128, 128, 32, 32], f16), T([128, 16, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 8, [True, True, False]), {})
cnt: 1, ((T([128, 128, 32, 32], f16), T([128, 512, 32, 32], f16), T([128, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 512, 32, 32], f16), T([128, 256, 64, 64], f16), T([512, 256, 1, 1], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 128, 32, 32], f16), T([128, 128, 64, 64], f16), T([128, 16, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 8, [True, True, False]), {})
cnt: 1, ((T([128, 128, 64, 64], f16), T([128, 256, 64, 64], f16), T([128, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 256, 64, 64], f16), T([128, 64, 64, 64], f16), T([256, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 1, 64], f16), T([128, 1, 64], f16), T([1, 1, 3], f16), [0], [1], [1], [1], False, [0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 64, 64, 64], f16), T([128, 64, 64, 64], f16), T([64, 16, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 4, [True, True, False]), {})
cnt: 1, ((T([128, 64, 64, 64], f16), T([128, 256, 64, 64], f16), T([64, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 64, 64, 64], f16), T([128, 64, 64, 64], f16), T([64, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 64, 128, 128], f16), T([128, 32, 128, 128], f16), T([64, 32, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 32, 128, 128], f16), T([128, 24, 128, 128], f16), T([32, 24, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 24, 128, 128], f16), T([128, 3, 256, 256], f16), T([24, 3, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [False, True, False]), {})
Operator: aten.copy_.default
cnt: 1, ((T([128, 3, 256, 256], f16), T([128, 3, 256, 256], f16)), {})
Operator: aten.div.Scalar
cnt: 1, ((T([128, 2048, 8, 8], f16, stride=(2048, 1, 0, 0)), 64), {})
cnt: 1, ((T([128, 256, 16, 16], f16, stride=(256, 1, 0, 0)), 256), {})
cnt: 2, ((T([128, 128, 32, 32], f16, stride=(128, 1, 0, 0)), 1024), {})
cnt: 2, ((T([128, 64, 64, 64], f16, stride=(64, 1, 0, 0)), 4096), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([128], i64),), {})
Operator: aten.max_pool2d_with_indices.default
cnt: 1, ((T([128, 64, 128, 128], f16), [3, 3], [2, 2], [1, 1]), {})
Operator: aten.max_pool2d_with_indices_backward.default
cnt: 1, ((T([128, 64, 64, 64], f16), T([128, 64, 128, 128], f16), [3, 3], [2, 2], [1, 1], [1, 1], False, T([128, 64, 64, 64], i64)), {})
Operator: aten.mean.dim
cnt: 2, ((T([128, 64, 64, 64], f16), [2, 3]), {})
cnt: 2, ((T([128, 128, 32, 32], f16), [2, 3]), {})
cnt: 1, ((T([128, 256, 16, 16], f16), [2, 3]), {})
cnt: 1, ((T([128, 2048, 8, 8], f16), [-1, -2], True), {})
Operator: aten.mm.default
cnt: 2, ((T([262144, 16], f16), T([16, 23], f16, stride=(1, 16))), {})
cnt: 4, ((T([65536, 16], f16), T([16, 23], f16, stride=(1, 16))), {})
cnt: 1, ((T([128, 1000], f16), T([1000, 2048], f16)), {})
cnt: 1, ((T([1000, 128], f16, stride=(1, 1000)), T([128, 2048], f16)), {})
cnt: 4, ((T([23, 65536], f16, stride=(1, 23)), T([65536, 16], f16)), {})
cnt: 4, ((T([65536, 23], f16), T([23, 16], f16)), {})
cnt: 2, ((T([23, 262144], f16, stride=(1, 23)), T([262144, 16], f16)), {})
cnt: 2, ((T([262144, 23], f16), T([23, 16], f16)), {})
Operator: aten.mul.Tensor
cnt: 4, ((T([128, 64, 64, 64], f16), T([128, 64, 64, 64], f16, stride=(64, 1, 0, 0))), {})
cnt: 4, ((T([128, 128, 32, 32], f16), T([128, 128, 32, 32], f16, stride=(128, 1, 0, 0))), {})
cnt: 2, ((T([128, 256, 16, 16], f16), T([128, 256, 16, 16], f16, stride=(256, 1, 0, 0))), {})
cnt: 2, ((T([1024, 4, 64, 144], f16), 0.25), {})
cnt: 2, ((T([1024, 4, 16, 144], f16), 0.25), {})
cnt: 2, ((T([1024, 1, 64, 144], f16), 0.25), {})
cnt: 1, ((T([128, 256, 16, 16], f16), T([128, 256, 16, 16], f16)), {})
cnt: 2, ((T([128, 128, 32, 32], f16), T([128, 128, 32, 32], f16)), {})
cnt: 2, ((T([128, 64, 64, 64], f16), T([128, 64, 64, 64], f16)), {})
Operator: aten.native_batch_norm.default
cnt: 1, ((T([128, 24, 128, 128], f16), T([24], f16), T([24], f16), T([24], f16), T([24], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 32, 128, 128], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 64, 128, 128], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([128, 64, 64, 64], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 256, 64, 64], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 128, 64, 64], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 128, 32, 32], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 512, 32, 32], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 256, 32, 32], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 256, 16, 16], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 1024, 16, 16], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 512, 16, 16], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 512, 8, 8], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 2048, 8, 8], f16), T([2048], f16), T([2048], f16), T([2048], f16), T([2048], f16), True, 0.1, 1e-05), {})
Operator: aten.native_batch_norm_backward.default
cnt: 3, ((T([128, 2048, 8, 8], f16), T([128, 2048, 8, 8], f16), T([2048], f16), T([2048], f16), T([2048], f16), T([2048], f32), T([2048], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 512, 8, 8], f16), T([128, 512, 8, 8], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 512, 16, 16], f16), T([128, 512, 16, 16], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 1024, 16, 16], f16), T([128, 1024, 16, 16], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f32), T([1024], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 256, 16, 16], f16), T([128, 256, 16, 16], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 256, 32, 32], f16), T([128, 256, 32, 32], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 512, 32, 32], f16), T([128, 512, 32, 32], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 128, 32, 32], f16), T([128, 128, 32, 32], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 128, 64, 64], f16), T([128, 128, 64, 64], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 256, 64, 64], f16), T([128, 256, 64, 64], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([128, 64, 64, 64], f16), T([128, 64, 64, 64], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 64, 128, 128], f16), T([128, 64, 128, 128], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 32, 128, 128], f16), T([128, 32, 128, 128], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 24, 128, 128], f16), T([128, 24, 128, 128], f16), T([24], f16), T([24], f16), T([24], f16), T([24], f32), T([24], f32), True, 1e-05, [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([128, 1000], f16), T([128], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([128, 1000], f16), T([128], i64), None, 1, -100), {})
Operator: aten.sigmoid.default
cnt: 2, ((T([128, 1, 64], f16),), {})
cnt: 2, ((T([128, 1, 128], f16),), {})
cnt: 1, ((T([128, 1, 256], f16),), {})
Operator: aten.sigmoid_backward.default
cnt: 1, ((T([128, 1, 256], f16), T([128, 1, 256], f16)), {})
cnt: 2, ((T([128, 1, 128], f16), T([128, 1, 128], f16)), {})
cnt: 2, ((T([128, 1, 64], f16), T([128, 1, 64], f16)), {})
Operator: aten.silu_.default
cnt: 1, ((T([128, 24, 128, 128], f16),), {})
cnt: 1, ((T([128, 32, 128, 128], f16),), {})
cnt: 1, ((T([128, 64, 128, 128], f16),), {})
cnt: 4, ((T([128, 64, 64, 64], f16),), {})
cnt: 2, ((T([128, 256, 64, 64], f16),), {})
cnt: 1, ((T([128, 128, 64, 64], f16),), {})
cnt: 3, ((T([128, 128, 32, 32], f16),), {})
cnt: 2, ((T([128, 512, 32, 32], f16),), {})
cnt: 1, ((T([128, 256, 32, 32], f16),), {})
cnt: 3, ((T([128, 256, 16, 16], f16),), {})
cnt: 2, ((T([128, 1024, 16, 16], f16),), {})
cnt: 1, ((T([128, 512, 16, 16], f16),), {})
cnt: 3, ((T([128, 512, 8, 8], f16),), {})
cnt: 2, ((T([128, 2048, 8, 8], f16),), {})
Operator: aten.silu_backward.default
cnt: 2, ((T([128, 2048, 8, 8], f16), T([128, 2048, 8, 8], f16)), {})
cnt: 3, ((T([128, 512, 8, 8], f16), T([128, 512, 8, 8], f16)), {})
cnt: 1, ((T([128, 512, 16, 16], f16), T([128, 512, 16, 16], f16)), {})
cnt: 2, ((T([128, 1024, 16, 16], f16), T([128, 1024, 16, 16], f16)), {})
cnt: 3, ((T([128, 256, 16, 16], f16), T([128, 256, 16, 16], f16)), {})
cnt: 1, ((T([128, 256, 32, 32], f16), T([128, 256, 32, 32], f16)), {})
cnt: 2, ((T([128, 512, 32, 32], f16), T([128, 512, 32, 32], f16)), {})
cnt: 3, ((T([128, 128, 32, 32], f16), T([128, 128, 32, 32], f16)), {})
cnt: 1, ((T([128, 128, 64, 64], f16), T([128, 128, 64, 64], f16)), {})
cnt: 2, ((T([128, 256, 64, 64], f16), T([128, 256, 64, 64], f16)), {})
cnt: 4, ((T([128, 64, 64, 64], f16), T([128, 64, 64, 64], f16)), {})
cnt: 1, ((T([128, 64, 128, 128], f16), T([128, 64, 128, 128], f16)), {})
cnt: 1, ((T([128, 32, 128, 128], f16), T([128, 32, 128, 128], f16)), {})
cnt: 1, ((T([128, 24, 128, 128], f16), T([128, 24, 128, 128], f16)), {})
Operator: aten.slice_backward.default
cnt: 2, ((T([8192, 8, 12], f16), [8192, 8, 23], 2, 11, 9223372036854775807, 1), {})
cnt: 2, ((T([8192, 8, 23], f16), [8192, 9, 23], 1, 0, 8, 1), {})
cnt: 2, ((T([8192, 9, 23], f16), [8192, 9, 23], 0, 0, 9223372036854775807, 1), {})
cnt: 2, ((T([16384, 4, 12], f16), [16384, 4, 23], 2, 11, 9223372036854775807, 1), {})
cnt: 2, ((T([16384, 4, 23], f16), [16384, 5, 23], 1, 0, 4, 1), {})
cnt: 2, ((T([16384, 5, 23], f16), [16384, 5, 23], 0, 0, 9223372036854775807, 1), {})
cnt: 2, ((T([32768, 8, 12], f16), [32768, 8, 23], 2, 11, 9223372036854775807, 1), {})
cnt: 2, ((T([32768, 8, 23], f16), [32768, 9, 23], 1, 0, 8, 1), {})
cnt: 2, ((T([32768, 9, 23], f16), [32768, 9, 23], 0, 0, 9223372036854775807, 1), {})
Operator: aten.split_with_sizes.default
cnt: 1, ((T([1024, 4, 144, 48], f16, stride=(27648, 144, 1, 576)), [16, 32], -1), {})
cnt: 1, ((T([1024, 4, 144, 80], f16, stride=(46080, 144, 1, 576)), [16, 64], -1), {})
cnt: 1, ((T([1024, 1, 144, 80], f16, stride=(11520, 144, 1, 144)), [16, 64], -1), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([128, 1000], f16), [0], True), {})
cnt: 1, ((T([1024, 8, 12, 8, 12], f16, stride=(9216, 144, 1, 1152, 12)), [2], True), {})
cnt: 1, ((T([1024, 8, 12, 8, 12], f16, stride=(9216, 1152, 12, 144, 1)), [2], True), {})
cnt: 1, ((T([4096, 4, 12, 4, 12], f16, stride=(2304, 144, 1, 576, 12)), [2], True), {})
cnt: 1, ((T([4096, 4, 12, 4, 12], f16, stride=(2304, 576, 12, 144, 1)), [2], True), {})
cnt: 1, ((T([4096, 8, 12, 8, 12], f16, stride=(9216, 144, 1, 1152, 12)), [2], True), {})
cnt: 1, ((T([4096, 8, 12, 8, 12], f16, stride=(9216, 1152, 12, 144, 1)), [2], True), {})
cnt: 1, ((T([128, 256, 16, 16], f16), [2, 3], True), {})
cnt: 2, ((T([128, 128, 32, 32], f16), [2, 3], True), {})
cnt: 2, ((T([128, 64, 64, 64], f16), [2, 3], True), {})
Operator: aten.unfold_backward.default
cnt: 1, ((T([128, 640, 1, 1, 12, 12], f16), [128, 640, 1, 12, 12], 3, 12, 8), {})
cnt: 1, ((T([128, 640, 1, 12, 12], f16), [128, 640, 12, 12], 2, 12, 8), {})
cnt: 1, ((T([128, 640, 2, 2, 12, 12], f16), [128, 640, 2, 20, 12], 3, 12, 8), {})
cnt: 1, ((T([128, 640, 2, 20, 12], f16), [128, 640, 20, 20], 2, 12, 8), {})
cnt: 1, ((T([128, 384, 2, 2, 12, 12], f16), [128, 384, 2, 20, 12], 3, 12, 8), {})
cnt: 1, ((T([128, 384, 2, 20, 12], f16), [128, 384, 20, 20], 2, 12, 8), {})

View File

@ -0,0 +1,195 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([64, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([64, 1000], f16), T([64, 1000], f16), 1, f16), {})
Operator: aten.add.Tensor
cnt: 5, ((T([64, 2048, 7, 7], f16), T([64, 2048, 7, 7], f16)), {})
cnt: 46, ((T([64, 1024, 14, 14], f16), T([64, 1024, 14, 14], f16)), {})
cnt: 8, ((T([64, 512, 28, 28], f16), T([64, 512, 28, 28], f16)), {})
cnt: 6, ((T([64, 256, 56, 56], f16), T([64, 256, 56, 56], f16)), {})
cnt: 1, ((T([64, 64, 56, 56], f16), T([64, 64, 56, 56], f16)), {})
Operator: aten.add_.Tensor
cnt: 106, ((T([], i64), 1), {})
cnt: 3, ((T([64, 256, 56, 56], f16), T([64, 256, 56, 56], f16)), {})
cnt: 4, ((T([64, 512, 28, 28], f16), T([64, 512, 28, 28], f16)), {})
cnt: 23, ((T([64, 1024, 14, 14], f16), T([64, 1024, 14, 14], f16)), {})
cnt: 3, ((T([64, 2048, 7, 7], f16), T([64, 2048, 7, 7], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([64, 2048], f16), T([2048, 1000], f16, stride=(1, 2048))), {})
Operator: aten.avg_pool2d.default
cnt: 1, ((T([64, 256, 56, 56], f16), [2, 2], [2, 2], [0, 0], True, False), {})
cnt: 1, ((T([64, 512, 28, 28], f16), [2, 2], [2, 2], [0, 0], True, False), {})
cnt: 1, ((T([64, 1024, 14, 14], f16), [2, 2], [2, 2], [0, 0], True, False), {})
Operator: aten.avg_pool2d_backward.default
cnt: 1, ((T([64, 1024, 7, 7], f16), T([64, 1024, 14, 14], f16), [2, 2], [2, 2], [0, 0], True, False, None), {})
cnt: 1, ((T([64, 512, 14, 14], f16), T([64, 512, 28, 28], f16), [2, 2], [2, 2], [0, 0], True, False, None), {})
cnt: 1, ((T([64, 256, 28, 28], f16), T([64, 256, 56, 56], f16), [2, 2], [2, 2], [0, 0], True, False, None), {})
Operator: aten.clone.default
cnt: 1, ((T([64, 3, 224, 224], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([64, 3, 224, 224], f16), T([32, 3, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 32, 112, 112], f16), T([32, 32, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 32, 112, 112], f16), T([64, 32, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 64, 56, 56], f16), T([64, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([64, 64, 56, 56], f16), T([64, 64, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([64, 64, 56, 56], f16), T([256, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([64, 1, 256], f16), T([1, 1, 5], f16), None, [1], [2], [1], False, [0], 1), {})
cnt: 2, ((T([64, 256, 56, 56], f16), T([64, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 256, 56, 56], f16), T([128, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 128, 56, 56], f16), T([128, 128, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([64, 128, 28, 28], f16), T([512, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([64, 1, 512], f16), T([1, 1, 5], f16), None, [1], [2], [1], False, [0], 1), {})
cnt: 1, ((T([64, 256, 28, 28], f16), T([512, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([64, 512, 28, 28], f16), T([128, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([64, 128, 28, 28], f16), T([128, 128, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 512, 28, 28], f16), T([256, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 256, 28, 28], f16), T([256, 256, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 23, ((T([64, 256, 14, 14], f16), T([1024, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 23, ((T([64, 1, 1024], f16), T([1, 1, 5], f16), None, [1], [2], [1], False, [0], 1), {})
cnt: 1, ((T([64, 512, 14, 14], f16), T([1024, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 22, ((T([64, 1024, 14, 14], f16), T([256, 1024, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 22, ((T([64, 256, 14, 14], f16), T([256, 256, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 1024, 14, 14], f16), T([512, 1024, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 512, 14, 14], f16), T([512, 512, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([64, 512, 7, 7], f16), T([2048, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([64, 1, 2048], f16), T([1, 1, 7], f16), None, [1], [3], [1], False, [0], 1), {})
cnt: 1, ((T([64, 1024, 7, 7], f16), T([2048, 1024, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([64, 2048, 7, 7], f16), T([512, 2048, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([64, 512, 7, 7], f16), T([512, 512, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 3, ((T([64, 1, 2048], f16), T([64, 1, 2048], f16), T([1, 1, 7], f16), [0], [1], [3], [1], False, [0], 1, [True, True, False]), {})
cnt: 3, ((T([64, 2048, 7, 7], f16), T([64, 512, 7, 7], f16), T([2048, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([64, 512, 7, 7], f16), T([64, 512, 7, 7], f16), T([512, 512, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([64, 512, 7, 7], f16), T([64, 2048, 7, 7], f16), T([512, 2048, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 2048, 7, 7], f16), T([64, 1024, 7, 7], f16), T([2048, 1024, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 512, 7, 7], f16), T([64, 512, 14, 14], f16), T([512, 512, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 512, 14, 14], f16), T([64, 1024, 14, 14], f16), T([512, 1024, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 23, ((T([64, 1, 1024], f16), T([64, 1, 1024], f16), T([1, 1, 5], f16), [0], [1], [2], [1], False, [0], 1, [True, True, False]), {})
cnt: 23, ((T([64, 1024, 14, 14], f16), T([64, 256, 14, 14], f16), T([1024, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 22, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16), T([256, 256, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 22, ((T([64, 256, 14, 14], f16), T([64, 1024, 14, 14], f16), T([256, 1024, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 1024, 14, 14], f16), T([64, 512, 14, 14], f16), T([1024, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 256, 14, 14], f16), T([64, 256, 28, 28], f16), T([256, 256, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 256, 28, 28], f16), T([64, 512, 28, 28], f16), T([256, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([64, 1, 512], f16), T([64, 1, 512], f16), T([1, 1, 5], f16), [0], [1], [2], [1], False, [0], 1, [True, True, False]), {})
cnt: 4, ((T([64, 512, 28, 28], f16), T([64, 128, 28, 28], f16), T([512, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([64, 128, 28, 28], f16), T([64, 128, 28, 28], f16), T([128, 128, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([64, 128, 28, 28], f16), T([64, 512, 28, 28], f16), T([128, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 512, 28, 28], f16), T([64, 256, 28, 28], f16), T([512, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 28, 28], f16), T([64, 128, 56, 56], f16), T([128, 128, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 128, 56, 56], f16), T([64, 256, 56, 56], f16), T([128, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([64, 1, 256], f16), T([64, 1, 256], f16), T([1, 1, 5], f16), [0], [1], [2], [1], False, [0], 1, [True, True, False]), {})
cnt: 4, ((T([64, 256, 56, 56], f16), T([64, 64, 56, 56], f16), T([256, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([64, 64, 56, 56], f16), T([64, 64, 56, 56], f16), T([64, 64, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([64, 64, 56, 56], f16), T([64, 256, 56, 56], f16), T([64, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 64, 56, 56], f16), T([64, 64, 56, 56], f16), T([64, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 64, 112, 112], f16), T([64, 32, 112, 112], f16), T([64, 32, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 32, 112, 112], f16), T([64, 32, 112, 112], f16), T([32, 32, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 32, 112, 112], f16), T([64, 3, 224, 224], f16), T([32, 3, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [False, True, False]), {})
Operator: aten.copy_.default
cnt: 1, ((T([64, 3, 224, 224], f16), T([64, 3, 224, 224], f16)), {})
Operator: aten.div.Scalar
cnt: 4, ((T([64, 2048, 7, 7], f16, stride=(2048, 1, 0, 0)), 49), {})
cnt: 23, ((T([64, 1024, 14, 14], f16, stride=(1024, 1, 0, 0)), 196), {})
cnt: 4, ((T([64, 512, 28, 28], f16, stride=(512, 1, 0, 0)), 784), {})
cnt: 3, ((T([64, 256, 56, 56], f16, stride=(256, 1, 0, 0)), 3136), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([64], i64),), {})
Operator: aten.max_pool2d_with_indices.default
cnt: 1, ((T([64, 64, 112, 112], f16), [3, 3], [2, 2], [1, 1]), {})
Operator: aten.max_pool2d_with_indices_backward.default
cnt: 1, ((T([64, 64, 56, 56], f16), T([64, 64, 112, 112], f16), [3, 3], [2, 2], [1, 1], [1, 1], False, T([64, 64, 56, 56], i64)), {})
Operator: aten.mean.dim
cnt: 3, ((T([64, 256, 56, 56], f16), [2, 3]), {})
cnt: 4, ((T([64, 512, 28, 28], f16), [2, 3]), {})
cnt: 23, ((T([64, 1024, 14, 14], f16), [2, 3]), {})
cnt: 3, ((T([64, 2048, 7, 7], f16), [2, 3]), {})
cnt: 1, ((T([64, 2048, 7, 7], f16), [-1, -2], True), {})
Operator: aten.mm.default
cnt: 1, ((T([64, 1000], f16), T([1000, 2048], f16)), {})
cnt: 1, ((T([1000, 64], f16, stride=(1, 1000)), T([64, 2048], f16)), {})
Operator: aten.mul.Tensor
cnt: 6, ((T([64, 256, 56, 56], f16), T([64, 256, 56, 56], f16, stride=(256, 1, 0, 0))), {})
cnt: 8, ((T([64, 512, 28, 28], f16), T([64, 512, 28, 28], f16, stride=(512, 1, 0, 0))), {})
cnt: 46, ((T([64, 1024, 14, 14], f16), T([64, 1024, 14, 14], f16, stride=(1024, 1, 0, 0))), {})
cnt: 6, ((T([64, 2048, 7, 7], f16), T([64, 2048, 7, 7], f16, stride=(2048, 1, 0, 0))), {})
cnt: 3, ((T([64, 2048, 7, 7], f16), T([64, 2048, 7, 7], f16)), {})
cnt: 23, ((T([64, 1024, 14, 14], f16), T([64, 1024, 14, 14], f16)), {})
cnt: 4, ((T([64, 512, 28, 28], f16), T([64, 512, 28, 28], f16)), {})
cnt: 3, ((T([64, 256, 56, 56], f16), T([64, 256, 56, 56], f16)), {})
Operator: aten.native_batch_norm.default
cnt: 2, ((T([64, 32, 112, 112], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 64, 112, 112], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 6, ((T([64, 64, 56, 56], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([64, 256, 56, 56], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 128, 56, 56], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 7, ((T([64, 128, 28, 28], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 5, ((T([64, 512, 28, 28], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 256, 28, 28], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 45, ((T([64, 256, 14, 14], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 24, ((T([64, 1024, 14, 14], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 512, 14, 14], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
cnt: 5, ((T([64, 512, 7, 7], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([64, 2048, 7, 7], f16), T([2048], f16), T([2048], f16), T([2048], f16), T([2048], f16), True, 0.1, 1e-05), {})
Operator: aten.native_batch_norm_backward.default
cnt: 4, ((T([64, 2048, 7, 7], f16), T([64, 2048, 7, 7], f16), T([2048], f16), T([2048], f16), T([2048], f16), T([2048], f32), T([2048], f32), True, 1e-05, [True, True, True]), {})
cnt: 5, ((T([64, 512, 7, 7], f16), T([64, 512, 7, 7], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 512, 14, 14], f16), T([64, 512, 14, 14], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 24, ((T([64, 1024, 14, 14], f16), T([64, 1024, 14, 14], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f32), T([1024], f32), True, 1e-05, [True, True, True]), {})
cnt: 45, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 256, 28, 28], f16), T([64, 256, 28, 28], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 5, ((T([64, 512, 28, 28], f16), T([64, 512, 28, 28], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 7, ((T([64, 128, 28, 28], f16), T([64, 128, 28, 28], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 128, 56, 56], f16), T([64, 128, 56, 56], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([64, 256, 56, 56], f16), T([64, 256, 56, 56], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 6, ((T([64, 64, 56, 56], f16), T([64, 64, 56, 56], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 64, 112, 112], f16), T([64, 64, 112, 112], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([64, 32, 112, 112], f16), T([64, 32, 112, 112], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 1e-05, [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([64, 1000], f16), T([64], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([64, 1000], f16), T([64], i64), None, 1, -100), {})
Operator: aten.relu_.default
cnt: 2, ((T([64, 32, 112, 112], f16),), {})
cnt: 1, ((T([64, 64, 112, 112], f16),), {})
cnt: 6, ((T([64, 64, 56, 56], f16),), {})
cnt: 3, ((T([64, 256, 56, 56], f16),), {})
cnt: 1, ((T([64, 128, 56, 56], f16),), {})
cnt: 7, ((T([64, 128, 28, 28], f16),), {})
cnt: 4, ((T([64, 512, 28, 28], f16),), {})
cnt: 1, ((T([64, 256, 28, 28], f16),), {})
cnt: 45, ((T([64, 256, 14, 14], f16),), {})
cnt: 23, ((T([64, 1024, 14, 14], f16),), {})
cnt: 1, ((T([64, 512, 14, 14], f16),), {})
cnt: 5, ((T([64, 512, 7, 7], f16),), {})
cnt: 3, ((T([64, 2048, 7, 7], f16),), {})
Operator: aten.sigmoid.default
cnt: 3, ((T([64, 1, 256], f16),), {})
cnt: 4, ((T([64, 1, 512], f16),), {})
cnt: 23, ((T([64, 1, 1024], f16),), {})
cnt: 3, ((T([64, 1, 2048], f16),), {})
Operator: aten.sigmoid_backward.default
cnt: 3, ((T([64, 1, 2048], f16), T([64, 1, 2048], f16)), {})
cnt: 23, ((T([64, 1, 1024], f16), T([64, 1, 1024], f16)), {})
cnt: 4, ((T([64, 1, 512], f16), T([64, 1, 512], f16)), {})
cnt: 3, ((T([64, 1, 256], f16), T([64, 1, 256], f16)), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([64, 1000], f16), [0], True), {})
cnt: 3, ((T([64, 2048, 7, 7], f16), [2, 3], True), {})
cnt: 23, ((T([64, 1024, 14, 14], f16), [2, 3], True), {})
cnt: 4, ((T([64, 512, 28, 28], f16), [2, 3], True), {})
cnt: 3, ((T([64, 256, 56, 56], f16), [2, 3], True), {})
Operator: aten.threshold_backward.default
cnt: 3, ((T([64, 2048, 7, 7], f16), T([64, 2048, 7, 7], f16), 0), {})
cnt: 5, ((T([64, 512, 7, 7], f16), T([64, 512, 7, 7], f16), 0), {})
cnt: 1, ((T([64, 512, 14, 14], f16), T([64, 512, 14, 14], f16), 0), {})
cnt: 23, ((T([64, 1024, 14, 14], f16), T([64, 1024, 14, 14], f16), 0), {})
cnt: 45, ((T([64, 256, 14, 14], f16), T([64, 256, 14, 14], f16), 0), {})
cnt: 1, ((T([64, 256, 28, 28], f16), T([64, 256, 28, 28], f16), 0), {})
cnt: 4, ((T([64, 512, 28, 28], f16), T([64, 512, 28, 28], f16), 0), {})
cnt: 7, ((T([64, 128, 28, 28], f16), T([64, 128, 28, 28], f16), 0), {})
cnt: 1, ((T([64, 128, 56, 56], f16), T([64, 128, 56, 56], f16), 0), {})
cnt: 3, ((T([64, 256, 56, 56], f16), T([64, 256, 56, 56], f16), 0), {})
cnt: 6, ((T([64, 64, 56, 56], f16), T([64, 64, 56, 56], f16), 0), {})
cnt: 1, ((T([64, 64, 112, 112], f16), T([64, 64, 112, 112], f16), 0), {})
cnt: 2, ((T([64, 32, 112, 112], f16), T([64, 32, 112, 112], f16), 0), {})

View File

@ -0,0 +1,182 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([128, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([128, 1000], f16), T([128, 1000], f16), 1, f16), {})
Operator: aten.add.Tensor
cnt: 23, ((T([], i64), 1), {})
cnt: 1, ((T([128, 1024, 7, 7], f16), T([128, 1024, 7, 7], f16)), {})
cnt: 2, ((T([128, 224, 7, 7], f16, stride=(70560, 49, 7, 1)), T([128, 224, 7, 7], f16)), {})
cnt: 1, ((T([128, 768, 7, 7], f16, stride=(70560, 49, 7, 1)), T([128, 768, 7, 7], f16)), {})
cnt: 1, ((T([128, 768, 14, 14], f16), T([128, 768, 14, 14], f16)), {})
cnt: 2, ((T([128, 192, 14, 14], f16, stride=(213248, 196, 14, 1)), T([128, 192, 14, 14], f16)), {})
cnt: 1, ((T([128, 512, 14, 14], f16, stride=(213248, 196, 14, 1)), T([128, 512, 14, 14], f16)), {})
cnt: 1, ((T([128, 512, 28, 28], f16), T([128, 512, 28, 28], f16)), {})
cnt: 2, ((T([128, 160, 28, 28], f16, stride=(577024, 784, 28, 1)), T([128, 160, 28, 28], f16)), {})
cnt: 1, ((T([128, 256, 28, 28], f16, stride=(577024, 784, 28, 1)), T([128, 256, 28, 28], f16)), {})
cnt: 1, ((T([128, 256, 56, 56], f16), T([128, 256, 56, 56], f16)), {})
cnt: 2, ((T([128, 128, 56, 56], f16, stride=(1404928, 3136, 56, 1)), T([128, 128, 56, 56], f16)), {})
cnt: 1, ((T([128, 64, 56, 56], f16, stride=(1404928, 3136, 56, 1)), T([128, 64, 56, 56], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([128, 1024], f16), T([1024, 1000], f16, stride=(1, 1024))), {})
Operator: aten.cat.default
cnt: 1, (([T([128, 64, 56, 56], f16), T([128, 128, 56, 56], f16), T([128, 128, 56, 56], f16), T([128, 128, 56, 56], f16)], 1), {})
cnt: 1, (([T([128, 256, 28, 28], f16), T([128, 160, 28, 28], f16), T([128, 160, 28, 28], f16), T([128, 160, 28, 28], f16)], 1), {})
cnt: 1, (([T([128, 512, 14, 14], f16), T([128, 192, 14, 14], f16), T([128, 192, 14, 14], f16), T([128, 192, 14, 14], f16)], 1), {})
cnt: 1, (([T([128, 768, 7, 7], f16), T([128, 224, 7, 7], f16), T([128, 224, 7, 7], f16), T([128, 224, 7, 7], f16)], 1), {})
Operator: aten.clone.default
cnt: 1, ((T([128, 3, 224, 224], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([128, 3, 224, 224], f16), T([64, 3, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 64, 112, 112], f16), T([64, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 64), {})
cnt: 1, ((T([128, 64, 112, 112], f16), T([64, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 64, 112, 112], f16), T([64, 1, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 64), {})
cnt: 1, ((T([128, 64, 56, 56], f16), T([64, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 64, 56, 56], f16), T([128, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 128, 56, 56], f16), T([128, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 128), {})
cnt: 3, ((T([128, 128, 56, 56], f16), T([128, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 448, 56, 56], f16), T([256, 448, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 1, 1], f16), T([256, 256, 1, 1], f16), T([256], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 28, 28], f16), T([160, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 160, 28, 28], f16), T([160, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 160), {})
cnt: 3, ((T([128, 160, 28, 28], f16), T([160, 160, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 736, 28, 28], f16), T([512, 736, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 512, 1, 1], f16), T([512, 512, 1, 1], f16), T([512], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 512, 14, 14], f16), T([192, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 192, 14, 14], f16), T([192, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 192), {})
cnt: 3, ((T([128, 192, 14, 14], f16), T([192, 192, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1088, 14, 14], f16), T([768, 1088, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 768, 1, 1], f16), T([768, 768, 1, 1], f16), T([768], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 768, 7, 7], f16), T([224, 768, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 224, 7, 7], f16), T([224, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 224), {})
cnt: 3, ((T([128, 224, 7, 7], f16), T([224, 224, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1440, 7, 7], f16), T([1024, 1440, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1024, 1, 1], f16), T([1024, 1024, 1, 1], f16), T([1024], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([128, 1024, 1, 1], f16), T([128, 1024, 1, 1], f16), T([1024, 1024, 1, 1], f16), [1024], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 1024, 7, 7], f16), T([128, 1440, 7, 7], f16), T([1024, 1440, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 224, 7, 7], f16), T([128, 224, 7, 7], f16), T([224, 224, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 224, 7, 7], f16), T([128, 224, 7, 7], f16), T([224, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 224, [True, True, False]), {})
cnt: 1, ((T([128, 224, 7, 7], f16), T([128, 768, 7, 7], f16), T([224, 768, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 768, 1, 1], f16), T([128, 768, 1, 1], f16), T([768, 768, 1, 1], f16), [768], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 768, 14, 14], f16), T([128, 1088, 14, 14], f16), T([768, 1088, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 192, 14, 14], f16), T([128, 192, 14, 14], f16), T([192, 192, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 192, 14, 14], f16), T([128, 192, 14, 14], f16), T([192, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 192, [True, True, False]), {})
cnt: 1, ((T([128, 192, 14, 14], f16), T([128, 512, 14, 14], f16), T([192, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 512, 1, 1], f16), T([128, 512, 1, 1], f16), T([512, 512, 1, 1], f16), [512], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 512, 28, 28], f16), T([128, 736, 28, 28], f16), T([512, 736, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 160, 28, 28], f16), T([128, 160, 28, 28], f16), T([160, 160, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 160, 28, 28], f16), T([128, 160, 28, 28], f16), T([160, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 160, [True, True, False]), {})
cnt: 1, ((T([128, 160, 28, 28], f16), T([128, 256, 28, 28], f16), T([160, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 256, 1, 1], f16), T([128, 256, 1, 1], f16), T([256, 256, 1, 1], f16), [256], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 256, 56, 56], f16), T([128, 448, 56, 56], f16), T([256, 448, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 128, 56, 56], f16), T([128, 128, 56, 56], f16), T([128, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 128, 56, 56], f16), T([128, 128, 56, 56], f16), T([128, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 128, [True, True, False]), {})
cnt: 1, ((T([128, 128, 56, 56], f16), T([128, 64, 56, 56], f16), T([128, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 64, 56, 56], f16), T([128, 64, 56, 56], f16), T([64, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 64, 56, 56], f16), T([128, 64, 112, 112], f16), T([64, 1, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 64, [True, True, False]), {})
cnt: 1, ((T([128, 64, 112, 112], f16), T([128, 64, 112, 112], f16), T([64, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 64, 112, 112], f16), T([128, 64, 112, 112], f16), T([64, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 64, [True, True, False]), {})
cnt: 1, ((T([128, 64, 112, 112], f16), T([128, 3, 224, 224], f16), T([64, 3, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [False, True, False]), {})
Operator: aten.copy_.default
cnt: 1, ((T([128, 3, 224, 224], f16), T([128, 3, 224, 224], f16)), {})
Operator: aten.div.Scalar
cnt: 2, ((T([128, 1024, 7, 7], f16, stride=(1024, 1, 0, 0)), 49), {})
cnt: 1, ((T([128, 768, 14, 14], f16, stride=(768, 1, 0, 0)), 196), {})
cnt: 1, ((T([128, 512, 28, 28], f16, stride=(512, 1, 0, 0)), 784), {})
cnt: 1, ((T([128, 256, 56, 56], f16, stride=(256, 1, 0, 0)), 3136), {})
Operator: aten.hardsigmoid.default
cnt: 1, ((T([128, 256, 1, 1], f16),), {})
cnt: 1, ((T([128, 512, 1, 1], f16),), {})
cnt: 1, ((T([128, 768, 1, 1], f16),), {})
cnt: 1, ((T([128, 1024, 1, 1], f16),), {})
Operator: aten.hardsigmoid_backward.default
cnt: 1, ((T([128, 1024, 1, 1], f16), T([128, 1024, 1, 1], f16)), {})
cnt: 1, ((T([128, 768, 1, 1], f16), T([128, 768, 1, 1], f16)), {})
cnt: 1, ((T([128, 512, 1, 1], f16), T([128, 512, 1, 1], f16)), {})
cnt: 1, ((T([128, 256, 1, 1], f16), T([128, 256, 1, 1], f16)), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([128], i64),), {})
Operator: aten.max_pool2d_with_indices.default
cnt: 1, ((T([128, 256, 56, 56], f16), [3, 3], [2, 2], [0, 0], [1, 1], True), {})
cnt: 1, ((T([128, 512, 28, 28], f16), [3, 3], [2, 2], [0, 0], [1, 1], True), {})
cnt: 1, ((T([128, 768, 14, 14], f16), [3, 3], [2, 2], [0, 0], [1, 1], True), {})
Operator: aten.max_pool2d_with_indices_backward.default
cnt: 1, ((T([128, 768, 7, 7], f16), T([128, 768, 14, 14], f16), [3, 3], [2, 2], [0, 0], [1, 1], True, T([128, 768, 7, 7], i64)), {})
cnt: 1, ((T([128, 512, 14, 14], f16), T([128, 512, 28, 28], f16), [3, 3], [2, 2], [0, 0], [1, 1], True, T([128, 512, 14, 14], i64)), {})
cnt: 1, ((T([128, 256, 28, 28], f16), T([128, 256, 56, 56], f16), [3, 3], [2, 2], [0, 0], [1, 1], True, T([128, 256, 28, 28], i64)), {})
Operator: aten.mean.dim
cnt: 1, ((T([128, 256, 56, 56], f16), [2, 3], True), {})
cnt: 1, ((T([128, 512, 28, 28], f16), [2, 3], True), {})
cnt: 1, ((T([128, 768, 14, 14], f16), [2, 3], True), {})
cnt: 1, ((T([128, 1024, 7, 7], f16), [2, 3], True), {})
cnt: 1, ((T([128, 1024, 7, 7], f16), [-1, -2], True), {})
Operator: aten.mm.default
cnt: 1, ((T([128, 1000], f16), T([1000, 1024], f16)), {})
cnt: 1, ((T([1000, 128], f16, stride=(1, 1000)), T([128, 1024], f16)), {})
Operator: aten.mul.Tensor
cnt: 2, ((T([128, 256, 56, 56], f16), T([128, 256, 1, 1], f16)), {})
cnt: 2, ((T([128, 512, 28, 28], f16), T([128, 512, 1, 1], f16)), {})
cnt: 2, ((T([128, 768, 14, 14], f16), T([128, 768, 1, 1], f16)), {})
cnt: 2, ((T([128, 1024, 7, 7], f16), T([128, 1024, 1, 1], f16)), {})
cnt: 1, ((T([128, 1024, 7, 7], f16), T([128, 1024, 7, 7], f16)), {})
cnt: 1, ((T([128, 768, 14, 14], f16), T([128, 768, 14, 14], f16)), {})
cnt: 1, ((T([128, 512, 28, 28], f16), T([128, 512, 28, 28], f16)), {})
cnt: 1, ((T([128, 256, 56, 56], f16), T([128, 256, 56, 56], f16)), {})
Operator: aten.native_batch_norm.default
cnt: 2, ((T([128, 64, 112, 112], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 64, 56, 56], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([128, 128, 56, 56], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 256, 56, 56], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([128, 160, 28, 28], f16), T([160], f16), T([160], f16), T([160], f16), T([160], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 512, 28, 28], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([128, 192, 14, 14], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 768, 14, 14], f16), T([768], f16), T([768], f16), T([768], f16), T([768], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([128, 224, 7, 7], f16), T([224], f16), T([224], f16), T([224], f16), T([224], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 1024, 7, 7], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f16), True, 0.1, 1e-05), {})
Operator: aten.native_batch_norm_backward.default
cnt: 1, ((T([128, 1024, 7, 7], f16), T([128, 1024, 7, 7], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f32), T([1024], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([128, 224, 7, 7], f16), T([128, 224, 7, 7], f16), T([224], f16), T([224], f16), T([224], f16), T([224], f32), T([224], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 768, 14, 14], f16), T([128, 768, 14, 14], f16), T([768], f16), T([768], f16), T([768], f16), T([768], f32), T([768], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([128, 192, 14, 14], f16), T([128, 192, 14, 14], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f32), T([192], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 512, 28, 28], f16), T([128, 512, 28, 28], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([128, 160, 28, 28], f16), T([128, 160, 28, 28], f16), T([160], f16), T([160], f16), T([160], f16), T([160], f32), T([160], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 256, 56, 56], f16), T([128, 256, 56, 56], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([128, 128, 56, 56], f16), T([128, 128, 56, 56], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 64, 56, 56], f16), T([128, 64, 56, 56], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 64, 112, 112], f16), T([128, 64, 112, 112], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([128, 1000], f16), T([128], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([128, 1000], f16), T([128], i64), None, 1, -100), {})
Operator: aten.relu_.default
cnt: 2, ((T([128, 64, 112, 112], f16),), {})
cnt: 1, ((T([128, 64, 56, 56], f16),), {})
cnt: 4, ((T([128, 128, 56, 56], f16),), {})
cnt: 1, ((T([128, 256, 56, 56], f16),), {})
cnt: 4, ((T([128, 160, 28, 28], f16),), {})
cnt: 1, ((T([128, 512, 28, 28], f16),), {})
cnt: 4, ((T([128, 192, 14, 14], f16),), {})
cnt: 1, ((T([128, 768, 14, 14], f16),), {})
cnt: 4, ((T([128, 224, 7, 7], f16),), {})
cnt: 1, ((T([128, 1024, 7, 7], f16),), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([128, 1000], f16), [0], True), {})
cnt: 1, ((T([128, 1024, 7, 7], f16), [2, 3], True), {})
cnt: 1, ((T([128, 768, 14, 14], f16), [2, 3], True), {})
cnt: 1, ((T([128, 512, 28, 28], f16), [2, 3], True), {})
cnt: 1, ((T([128, 256, 56, 56], f16), [2, 3], True), {})
Operator: aten.threshold_backward.default
cnt: 1, ((T([128, 1024, 7, 7], f16), T([128, 1024, 7, 7], f16), 0), {})
cnt: 1, ((T([128, 224, 7, 7], f16, stride=(70560, 49, 7, 1)), T([128, 224, 7, 7], f16), 0), {})
cnt: 3, ((T([128, 224, 7, 7], f16), T([128, 224, 7, 7], f16), 0), {})
cnt: 1, ((T([128, 768, 14, 14], f16), T([128, 768, 14, 14], f16), 0), {})
cnt: 1, ((T([128, 192, 14, 14], f16, stride=(213248, 196, 14, 1)), T([128, 192, 14, 14], f16), 0), {})
cnt: 3, ((T([128, 192, 14, 14], f16), T([128, 192, 14, 14], f16), 0), {})
cnt: 1, ((T([128, 512, 28, 28], f16), T([128, 512, 28, 28], f16), 0), {})
cnt: 1, ((T([128, 160, 28, 28], f16, stride=(577024, 784, 28, 1)), T([128, 160, 28, 28], f16), 0), {})
cnt: 3, ((T([128, 160, 28, 28], f16), T([128, 160, 28, 28], f16), 0), {})
cnt: 1, ((T([128, 256, 56, 56], f16), T([128, 256, 56, 56], f16), 0), {})
cnt: 1, ((T([128, 128, 56, 56], f16, stride=(1404928, 3136, 56, 1)), T([128, 128, 56, 56], f16), 0), {})
cnt: 3, ((T([128, 128, 56, 56], f16), T([128, 128, 56, 56], f16), 0), {})
cnt: 1, ((T([128, 64, 56, 56], f16), T([128, 64, 56, 56], f16), 0), {})
cnt: 2, ((T([128, 64, 112, 112], f16), T([128, 64, 112, 112], f16), 0), {})

View File

@ -0,0 +1,189 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([128, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([128, 1000], f16), T([128, 1000], f16), 1, f16), {})
Operator: aten.add.Tensor
cnt: 65, ((T([], i64), 1), {})
cnt: 2, ((T([128, 16, 112, 112], f16), T([128, 16, 112, 112], f16)), {})
cnt: 4, ((T([128, 24, 56, 56], f16), T([128, 24, 56, 56], f16)), {})
cnt: 6, ((T([128, 32, 28, 28], f16), T([128, 32, 28, 28], f16)), {})
cnt: 6, ((T([128, 64, 14, 14], f16), T([128, 64, 14, 14], f16)), {})
cnt: 6, ((T([128, 112, 14, 14], f16), T([128, 112, 14, 14], f16)), {})
cnt: 6, ((T([128, 184, 7, 7], f16), T([128, 184, 7, 7], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([128, 1984], f16), T([1984, 1000], f16, stride=(1, 1984))), {})
Operator: aten.clone.default
cnt: 1, ((T([128, 3, 224, 224], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([128, 3, 224, 224], f16), T([16, 3, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 16, 112, 112], f16), T([16, 16, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([16, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 16), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([96, 16, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 96, 112, 112], f16), T([96, 1, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 96), {})
cnt: 1, ((T([128, 96, 56, 56], f16), T([24, 96, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 24, 56, 56], f16), T([24, 24, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 24, 56, 56], f16), T([24, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 24), {})
cnt: 1, ((T([128, 24, 56, 56], f16), T([144, 24, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 144, 56, 56], f16), T([144, 1, 5, 5], f16), None, [2, 2], [2, 2], [1, 1], False, [0, 0], 144), {})
cnt: 1, ((T([128, 144, 28, 28], f16), T([32, 144, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 32, 28, 28], f16), T([96, 32, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 96, 28, 28], f16), T([96, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 96), {})
cnt: 1, ((T([128, 96, 28, 28], f16), T([32, 96, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 32, 28, 28], f16), T([192, 32, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 192, 28, 28], f16), T([192, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 192), {})
cnt: 2, ((T([128, 192, 28, 28], f16), T([32, 192, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 192, 28, 28], f16), T([192, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 192), {})
cnt: 1, ((T([128, 192, 28, 28], f16), T([192, 1, 5, 5], f16), None, [2, 2], [2, 2], [1, 1], False, [0, 0], 192), {})
cnt: 2, ((T([128, 192, 14, 14], f16), T([64, 192, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 64, 14, 14], f16), T([192, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 192, 14, 14], f16), T([192, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 192), {})
cnt: 3, ((T([128, 64, 14, 14], f16), T([384, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 384, 14, 14], f16), T([384, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 384), {})
cnt: 2, ((T([128, 384, 14, 14], f16), T([64, 384, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 384, 14, 14], f16), T([112, 384, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 112, 14, 14], f16), T([672, 112, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 672, 14, 14], f16), T([672, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 672), {})
cnt: 2, ((T([128, 672, 14, 14], f16), T([112, 672, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 112, 14, 14], f16), T([336, 112, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 336, 14, 14], f16), T([336, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 336), {})
cnt: 1, ((T([128, 336, 14, 14], f16), T([112, 336, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 672, 14, 14], f16), T([672, 1, 5, 5], f16), None, [2, 2], [2, 2], [1, 1], False, [0, 0], 672), {})
cnt: 1, ((T([128, 672, 7, 7], f16), T([184, 672, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 184, 7, 7], f16), T([1104, 184, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 1104, 7, 7], f16), T([1104, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 1104), {})
cnt: 3, ((T([128, 1104, 7, 7], f16), T([184, 1104, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1104, 7, 7], f16), T([1104, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1104), {})
cnt: 1, ((T([128, 1104, 7, 7], f16), T([352, 1104, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 352, 7, 7], f16), T([1984, 352, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([128, 1984, 7, 7], f16), T([128, 352, 7, 7], f16), T([1984, 352, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 352, 7, 7], f16), T([128, 1104, 7, 7], f16), T([352, 1104, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 1104, 7, 7], f16), T([128, 1104, 7, 7], f16), T([1104, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1104, [True, True, False]), {})
cnt: 4, ((T([128, 1104, 7, 7], f16), T([128, 184, 7, 7], f16), T([1104, 184, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 184, 7, 7], f16), T([128, 1104, 7, 7], f16), T([184, 1104, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 1104, 7, 7], f16), T([128, 1104, 7, 7], f16), T([1104, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 1104, [True, True, False]), {})
cnt: 1, ((T([128, 184, 7, 7], f16), T([128, 672, 7, 7], f16), T([184, 672, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 672, 7, 7], f16), T([128, 672, 14, 14], f16), T([672, 1, 5, 5], f16), [0], [2, 2], [2, 2], [1, 1], False, [0, 0], 672, [True, True, False]), {})
cnt: 3, ((T([128, 672, 14, 14], f16), T([128, 112, 14, 14], f16), T([672, 112, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 112, 14, 14], f16), T([128, 336, 14, 14], f16), T([112, 336, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 336, 14, 14], f16), T([128, 336, 14, 14], f16), T([336, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 336, [True, True, False]), {})
cnt: 1, ((T([128, 336, 14, 14], f16), T([128, 112, 14, 14], f16), T([336, 112, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 112, 14, 14], f16), T([128, 672, 14, 14], f16), T([112, 672, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 672, 14, 14], f16), T([128, 672, 14, 14], f16), T([672, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 672, [True, True, False]), {})
cnt: 1, ((T([128, 112, 14, 14], f16), T([128, 384, 14, 14], f16), T([112, 384, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 384, 14, 14], f16), T([128, 384, 14, 14], f16), T([384, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 384, [True, True, False]), {})
cnt: 3, ((T([128, 384, 14, 14], f16), T([128, 64, 14, 14], f16), T([384, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 64, 14, 14], f16), T([128, 384, 14, 14], f16), T([64, 384, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 64, 14, 14], f16), T([128, 192, 14, 14], f16), T([64, 192, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 192, 14, 14], f16), T([128, 192, 14, 14], f16), T([192, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 192, [True, True, False]), {})
cnt: 1, ((T([128, 192, 14, 14], f16), T([128, 64, 14, 14], f16), T([192, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 192, 14, 14], f16), T([128, 192, 28, 28], f16), T([192, 1, 5, 5], f16), [0], [2, 2], [2, 2], [1, 1], False, [0, 0], 192, [True, True, False]), {})
cnt: 3, ((T([128, 192, 28, 28], f16), T([128, 32, 28, 28], f16), T([192, 32, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 32, 28, 28], f16), T([128, 192, 28, 28], f16), T([32, 192, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 192, 28, 28], f16), T([128, 192, 28, 28], f16), T([192, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 192, [True, True, False]), {})
cnt: 1, ((T([128, 192, 28, 28], f16), T([128, 192, 28, 28], f16), T([192, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 192, [True, True, False]), {})
cnt: 1, ((T([128, 32, 28, 28], f16), T([128, 96, 28, 28], f16), T([32, 96, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 96, 28, 28], f16), T([128, 96, 28, 28], f16), T([96, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 96, [True, True, False]), {})
cnt: 1, ((T([128, 96, 28, 28], f16), T([128, 32, 28, 28], f16), T([96, 32, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 32, 28, 28], f16), T([128, 144, 28, 28], f16), T([32, 144, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 144, 28, 28], f16), T([128, 144, 56, 56], f16), T([144, 1, 5, 5], f16), [0], [2, 2], [2, 2], [1, 1], False, [0, 0], 144, [True, True, False]), {})
cnt: 1, ((T([128, 144, 56, 56], f16), T([128, 24, 56, 56], f16), T([144, 24, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 24, 56, 56], f16), T([128, 24, 56, 56], f16), T([24, 24, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 24, 56, 56], f16), T([128, 24, 56, 56], f16), T([24, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 24, [True, True, False]), {})
cnt: 1, ((T([128, 24, 56, 56], f16), T([128, 96, 56, 56], f16), T([24, 96, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 96, 56, 56], f16), T([128, 96, 112, 112], f16), T([96, 1, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 96, [True, True, False]), {})
cnt: 1, ((T([128, 96, 112, 112], f16), T([128, 16, 112, 112], f16), T([96, 16, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 16, 112, 112], f16), T([128, 16, 112, 112], f16), T([16, 16, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([128, 16, 112, 112], f16), T([16, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 16, [True, True, False]), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([128, 3, 224, 224], f16), T([16, 3, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [False, True, False]), {})
Operator: aten.copy_.default
cnt: 1, ((T([128, 3, 224, 224], f16), T([128, 3, 224, 224], f16)), {})
Operator: aten.div.Scalar
cnt: 1, ((T([128, 1984, 7, 7], f16, stride=(1984, 1, 0, 0)), 49), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([128], i64),), {})
Operator: aten.mean.dim
cnt: 1, ((T([128, 1984, 7, 7], f16), [-1, -2], True), {})
Operator: aten.mm.default
cnt: 1, ((T([128, 1000], f16), T([1000, 1984], f16)), {})
cnt: 1, ((T([1000, 128], f16, stride=(1, 1000)), T([128, 1984], f16)), {})
Operator: aten.native_batch_norm.default
cnt: 4, ((T([128, 16, 112, 112], f16), T([16], f16), T([16], f16), T([16], f16), T([16], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 96, 112, 112], f16), T([96], f16), T([96], f16), T([96], f16), T([96], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 96, 56, 56], f16), T([96], f16), T([96], f16), T([96], f16), T([96], f16), True, 0.1, 1e-05), {})
cnt: 7, ((T([128, 24, 56, 56], f16), T([24], f16), T([24], f16), T([24], f16), T([24], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 144, 56, 56], f16), T([144], f16), T([144], f16), T([144], f16), T([144], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 144, 28, 28], f16), T([144], f16), T([144], f16), T([144], f16), T([144], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([128, 32, 28, 28], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([128, 96, 28, 28], f16), T([96], f16), T([96], f16), T([96], f16), T([96], f16), True, 0.1, 1e-05), {})
cnt: 5, ((T([128, 192, 28, 28], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 192, 14, 14], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([128, 64, 14, 14], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 6, ((T([128, 384, 14, 14], f16), T([384], f16), T([384], f16), T([384], f16), T([384], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([128, 112, 14, 14], f16), T([112], f16), T([112], f16), T([112], f16), T([112], f16), True, 0.1, 1e-05), {})
cnt: 5, ((T([128, 672, 14, 14], f16), T([672], f16), T([672], f16), T([672], f16), T([672], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([128, 336, 14, 14], f16), T([336], f16), T([336], f16), T([336], f16), T([336], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 672, 7, 7], f16), T([672], f16), T([672], f16), T([672], f16), T([672], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([128, 184, 7, 7], f16), T([184], f16), T([184], f16), T([184], f16), T([184], f16), True, 0.1, 1e-05), {})
cnt: 8, ((T([128, 1104, 7, 7], f16), T([1104], f16), T([1104], f16), T([1104], f16), T([1104], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 352, 7, 7], f16), T([352], f16), T([352], f16), T([352], f16), T([352], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 1984, 7, 7], f16), T([1984], f16), T([1984], f16), T([1984], f16), T([1984], f16), True, 0.1, 1e-05), {})
Operator: aten.native_batch_norm_backward.default
cnt: 1, ((T([128, 1984, 7, 7], f16), T([128, 1984, 7, 7], f16), T([1984], f16), T([1984], f16), T([1984], f16), T([1984], f32), T([1984], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 352, 7, 7], f16), T([128, 352, 7, 7], f16), T([352], f16), T([352], f16), T([352], f16), T([352], f32), T([352], f32), True, 1e-05, [True, True, True]), {})
cnt: 8, ((T([128, 1104, 7, 7], f16), T([128, 1104, 7, 7], f16), T([1104], f16), T([1104], f16), T([1104], f16), T([1104], f32), T([1104], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([128, 184, 7, 7], f16), T([128, 184, 7, 7], f16), T([184], f16), T([184], f16), T([184], f16), T([184], f32), T([184], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 672, 7, 7], f16), T([128, 672, 7, 7], f16), T([672], f16), T([672], f16), T([672], f16), T([672], f32), T([672], f32), True, 1e-05, [True, True, True]), {})
cnt: 5, ((T([128, 672, 14, 14], f16), T([128, 672, 14, 14], f16), T([672], f16), T([672], f16), T([672], f16), T([672], f32), T([672], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([128, 112, 14, 14], f16), T([128, 112, 14, 14], f16), T([112], f16), T([112], f16), T([112], f16), T([112], f32), T([112], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 336, 14, 14], f16), T([128, 336, 14, 14], f16), T([336], f16), T([336], f16), T([336], f16), T([336], f32), T([336], f32), True, 1e-05, [True, True, True]), {})
cnt: 6, ((T([128, 384, 14, 14], f16), T([128, 384, 14, 14], f16), T([384], f16), T([384], f16), T([384], f16), T([384], f32), T([384], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([128, 64, 14, 14], f16), T([128, 64, 14, 14], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 192, 14, 14], f16), T([128, 192, 14, 14], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f32), T([192], f32), True, 1e-05, [True, True, True]), {})
cnt: 5, ((T([128, 192, 28, 28], f16), T([128, 192, 28, 28], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f32), T([192], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([128, 32, 28, 28], f16), T([128, 32, 28, 28], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 96, 28, 28], f16), T([128, 96, 28, 28], f16), T([96], f16), T([96], f16), T([96], f16), T([96], f32), T([96], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 144, 28, 28], f16), T([128, 144, 28, 28], f16), T([144], f16), T([144], f16), T([144], f16), T([144], f32), T([144], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 144, 56, 56], f16), T([128, 144, 56, 56], f16), T([144], f16), T([144], f16), T([144], f16), T([144], f32), T([144], f32), True, 1e-05, [True, True, True]), {})
cnt: 7, ((T([128, 24, 56, 56], f16), T([128, 24, 56, 56], f16), T([24], f16), T([24], f16), T([24], f16), T([24], f32), T([24], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 96, 56, 56], f16), T([128, 96, 56, 56], f16), T([96], f16), T([96], f16), T([96], f16), T([96], f32), T([96], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 96, 112, 112], f16), T([128, 96, 112, 112], f16), T([96], f16), T([96], f16), T([96], f16), T([96], f32), T([96], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([128, 16, 112, 112], f16), T([128, 16, 112, 112], f16), T([16], f16), T([16], f16), T([16], f16), T([16], f32), T([16], f32), True, 1e-05, [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([128, 1000], f16), T([128], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([128, 1000], f16), T([128], i64), None, 1, -100), {})
Operator: aten.relu_.default
cnt: 3, ((T([128, 16, 112, 112], f16),), {})
cnt: 1, ((T([128, 96, 112, 112], f16),), {})
cnt: 1, ((T([128, 96, 56, 56], f16),), {})
cnt: 4, ((T([128, 24, 56, 56], f16),), {})
cnt: 1, ((T([128, 144, 56, 56], f16),), {})
cnt: 1, ((T([128, 144, 28, 28], f16),), {})
cnt: 2, ((T([128, 96, 28, 28], f16),), {})
cnt: 5, ((T([128, 192, 28, 28], f16),), {})
cnt: 3, ((T([128, 192, 14, 14], f16),), {})
cnt: 6, ((T([128, 384, 14, 14], f16),), {})
cnt: 5, ((T([128, 672, 14, 14], f16),), {})
cnt: 2, ((T([128, 336, 14, 14], f16),), {})
cnt: 1, ((T([128, 672, 7, 7], f16),), {})
cnt: 8, ((T([128, 1104, 7, 7], f16),), {})
cnt: 1, ((T([128, 1984, 7, 7], f16),), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([128, 1000], f16), [0], True), {})
Operator: aten.threshold_backward.default
cnt: 1, ((T([128, 1984, 7, 7], f16), T([128, 1984, 7, 7], f16), 0), {})
cnt: 8, ((T([128, 1104, 7, 7], f16), T([128, 1104, 7, 7], f16), 0), {})
cnt: 1, ((T([128, 672, 7, 7], f16), T([128, 672, 7, 7], f16), 0), {})
cnt: 5, ((T([128, 672, 14, 14], f16), T([128, 672, 14, 14], f16), 0), {})
cnt: 2, ((T([128, 336, 14, 14], f16), T([128, 336, 14, 14], f16), 0), {})
cnt: 6, ((T([128, 384, 14, 14], f16), T([128, 384, 14, 14], f16), 0), {})
cnt: 3, ((T([128, 192, 14, 14], f16), T([128, 192, 14, 14], f16), 0), {})
cnt: 5, ((T([128, 192, 28, 28], f16), T([128, 192, 28, 28], f16), 0), {})
cnt: 2, ((T([128, 96, 28, 28], f16), T([128, 96, 28, 28], f16), 0), {})
cnt: 1, ((T([128, 144, 28, 28], f16), T([128, 144, 28, 28], f16), 0), {})
cnt: 1, ((T([128, 144, 56, 56], f16), T([128, 144, 56, 56], f16), 0), {})
cnt: 4, ((T([128, 24, 56, 56], f16), T([128, 24, 56, 56], f16), 0), {})
cnt: 1, ((T([128, 96, 56, 56], f16), T([128, 96, 56, 56], f16), 0), {})
cnt: 1, ((T([128, 96, 112, 112], f16), T([128, 96, 112, 112], f16), 0), {})
cnt: 3, ((T([128, 16, 112, 112], f16), T([128, 16, 112, 112], f16), 0), {})

View File

@ -0,0 +1,287 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([128, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([128, 1000], f16), T([128, 1000], f16), 1, f16), {})
Operator: aten.add.Tensor
cnt: 87, ((T([], i64), 1), {})
cnt: 4, ((T([128, 16, 112, 112], f16), T([128, 16, 112, 112], f16)), {})
cnt: 6, ((T([128, 24, 56, 56], f16), T([128, 24, 56, 56], f16)), {})
cnt: 8, ((T([128, 40, 28, 28], f16), T([128, 40, 28, 28], f16)), {})
cnt: 8, ((T([128, 72, 14, 14], f16), T([128, 72, 14, 14], f16)), {})
cnt: 10, ((T([128, 120, 14, 14], f16), T([128, 120, 14, 14], f16)), {})
cnt: 10, ((T([128, 184, 7, 7], f16), T([128, 184, 7, 7], f16)), {})
cnt: 1, ((T([128, 1104, 7, 7], f16), T([128, 1104, 7, 7], f16)), {})
cnt: 5, ((T([128, 736, 7, 7], f16), T([128, 736, 7, 7], f16)), {})
cnt: 1, ((T([128, 720, 7, 7], f16), T([128, 720, 7, 7], f16)), {})
cnt: 6, ((T([128, 360, 14, 14], f16), T([128, 360, 14, 14], f16)), {})
cnt: 5, ((T([128, 120, 28, 28], f16), T([128, 120, 28, 28], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([128, 1984], f16), T([1984, 1000], f16, stride=(1, 1984))), {})
Operator: aten.clone.default
cnt: 1, ((T([128, 3, 224, 224], f16),), {})
cnt: 3, ((T([128, 16, 112, 112], f16),), {})
cnt: 1, ((T([128, 64, 112, 112], f16),), {})
cnt: 1, ((T([128, 64, 56, 56], f16),), {})
cnt: 6, ((T([128, 48, 56, 56], f16),), {})
cnt: 1, ((T([128, 120, 56, 56], f16),), {})
cnt: 9, ((T([128, 120, 28, 28], f16),), {})
cnt: 1, ((T([128, 8, 1, 1], f16),), {})
cnt: 4, ((T([128, 16, 1, 1], f16),), {})
cnt: 1, ((T([128, 200, 28, 28], f16),), {})
cnt: 1, ((T([128, 200, 14, 14], f16),), {})
cnt: 8, ((T([128, 216, 14, 14], f16),), {})
cnt: 12, ((T([128, 360, 14, 14], f16),), {})
cnt: 1, ((T([128, 24, 1, 1], f16),), {})
cnt: 6, ((T([128, 32, 1, 1], f16),), {})
cnt: 1, ((T([128, 720, 14, 14], f16),), {})
cnt: 1, ((T([128, 720, 7, 7], f16),), {})
cnt: 10, ((T([128, 736, 7, 7], f16),), {})
cnt: 6, ((T([128, 48, 1, 1], f16),), {})
cnt: 2, ((T([128, 1104, 7, 7], f16),), {})
cnt: 1, ((T([128, 1344, 7, 7], f16),), {})
cnt: 1, ((T([128, 1984, 1, 1], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([128, 3, 224, 224], f16), T([16, 3, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 16, 112, 112], f16), T([16, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 16), {})
cnt: 2, ((T([128, 16, 112, 112], f16), T([16, 16, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([64, 16, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 64, 112, 112], f16), T([64, 1, 5, 5], f16), None, [2, 2], [2, 2], [1, 1], False, [0, 0], 64), {})
cnt: 1, ((T([128, 64, 56, 56], f16), T([24, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 24, 56, 56], f16), T([48, 24, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 48, 56, 56], f16), T([48, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 48), {})
cnt: 3, ((T([128, 48, 56, 56], f16), T([24, 48, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 24, 56, 56], f16), T([120, 24, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 120, 56, 56], f16), T([120, 1, 5, 5], f16), None, [2, 2], [2, 2], [1, 1], False, [0, 0], 120), {})
cnt: 1, ((T([128, 120, 1, 1], f16), T([8, 120, 1, 1], f16), T([8], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 8, 1, 1], f16), T([120, 8, 1, 1], f16), T([120], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 5, ((T([128, 120, 28, 28], f16), T([40, 120, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 40, 28, 28], f16), T([120, 40, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 120, 28, 28], f16), T([120, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 120), {})
cnt: 4, ((T([128, 120, 1, 1], f16), T([16, 120, 1, 1], f16), T([16], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 16, 1, 1], f16), T([120, 16, 1, 1], f16), T([120], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 40, 28, 28], f16), T([200, 40, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 200, 28, 28], f16), T([200, 1, 5, 5], f16), None, [2, 2], [2, 2], [1, 1], False, [0, 0], 200), {})
cnt: 1, ((T([128, 200, 14, 14], f16), T([72, 200, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 72, 14, 14], f16), T([216, 72, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 216, 14, 14], f16), T([216, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 216), {})
cnt: 4, ((T([128, 216, 14, 14], f16), T([72, 216, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 72, 14, 14], f16), T([360, 72, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 360, 14, 14], f16), T([360, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 360), {})
cnt: 1, ((T([128, 360, 1, 1], f16), T([24, 360, 1, 1], f16), T([24], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 24, 1, 1], f16), T([360, 24, 1, 1], f16), T([360], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 6, ((T([128, 360, 14, 14], f16), T([120, 360, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 5, ((T([128, 120, 14, 14], f16), T([360, 120, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 5, ((T([128, 360, 14, 14], f16), T([360, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 360), {})
cnt: 5, ((T([128, 360, 1, 1], f16), T([32, 360, 1, 1], f16), T([32], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 5, ((T([128, 32, 1, 1], f16), T([360, 32, 1, 1], f16), T([360], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 120, 14, 14], f16), T([720, 120, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 720, 14, 14], f16), T([720, 1, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 720), {})
cnt: 1, ((T([128, 720, 1, 1], f16), T([32, 720, 1, 1], f16), T([32], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 32, 1, 1], f16), T([720, 32, 1, 1], f16), T([720], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 720, 7, 7], f16), T([184, 720, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 5, ((T([128, 184, 7, 7], f16), T([736, 184, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 5, ((T([128, 736, 7, 7], f16), T([736, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 736), {})
cnt: 5, ((T([128, 736, 1, 1], f16), T([48, 736, 1, 1], f16), T([48], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 5, ((T([128, 48, 1, 1], f16), T([736, 48, 1, 1], f16), T([736], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 5, ((T([128, 736, 7, 7], f16), T([184, 736, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 184, 7, 7], f16), T([1104, 184, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1104, 7, 7], f16), T([1104, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 1104), {})
cnt: 1, ((T([128, 1104, 1, 1], f16), T([48, 1104, 1, 1], f16), T([48], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 48, 1, 1], f16), T([1104, 48, 1, 1], f16), T([1104], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1104, 7, 7], f16), T([224, 1104, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 224, 7, 7], f16), T([1344, 224, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1344, 1, 1], f16), T([1984, 1344, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([128, 1984, 1, 1], f16), T([128, 1344, 1, 1], f16), T([1984, 1344, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 1344, 7, 7], f16), T([128, 224, 7, 7], f16), T([1344, 224, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 224, 7, 7], f16), T([128, 1104, 7, 7], f16), T([224, 1104, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 1104, 1, 1], f16), T([128, 48, 1, 1], f16), T([1104, 48, 1, 1], f16), [1104], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 48, 1, 1], f16), T([128, 1104, 1, 1], f16), T([48, 1104, 1, 1], f16), [48], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 1104, 7, 7], f16), T([128, 1104, 7, 7], f16), T([1104, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 1104, [True, True, False]), {})
cnt: 1, ((T([128, 1104, 7, 7], f16), T([128, 184, 7, 7], f16), T([1104, 184, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 5, ((T([128, 184, 7, 7], f16), T([128, 736, 7, 7], f16), T([184, 736, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 5, ((T([128, 736, 1, 1], f16), T([128, 48, 1, 1], f16), T([736, 48, 1, 1], f16), [736], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 5, ((T([128, 48, 1, 1], f16), T([128, 736, 1, 1], f16), T([48, 736, 1, 1], f16), [48], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 5, ((T([128, 736, 7, 7], f16), T([128, 736, 7, 7], f16), T([736, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 736, [True, True, False]), {})
cnt: 5, ((T([128, 736, 7, 7], f16), T([128, 184, 7, 7], f16), T([736, 184, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 184, 7, 7], f16), T([128, 720, 7, 7], f16), T([184, 720, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 720, 1, 1], f16), T([128, 32, 1, 1], f16), T([720, 32, 1, 1], f16), [720], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 32, 1, 1], f16), T([128, 720, 1, 1], f16), T([32, 720, 1, 1], f16), [32], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 720, 7, 7], f16), T([128, 720, 14, 14], f16), T([720, 1, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 720, [True, True, False]), {})
cnt: 1, ((T([128, 720, 14, 14], f16), T([128, 120, 14, 14], f16), T([720, 120, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 6, ((T([128, 120, 14, 14], f16), T([128, 360, 14, 14], f16), T([120, 360, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 5, ((T([128, 360, 1, 1], f16), T([128, 32, 1, 1], f16), T([360, 32, 1, 1], f16), [360], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 5, ((T([128, 32, 1, 1], f16), T([128, 360, 1, 1], f16), T([32, 360, 1, 1], f16), [32], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 5, ((T([128, 360, 14, 14], f16), T([128, 360, 14, 14], f16), T([360, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 360, [True, True, False]), {})
cnt: 5, ((T([128, 360, 14, 14], f16), T([128, 120, 14, 14], f16), T([360, 120, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 360, 1, 1], f16), T([128, 24, 1, 1], f16), T([360, 24, 1, 1], f16), [360], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 24, 1, 1], f16), T([128, 360, 1, 1], f16), T([24, 360, 1, 1], f16), [24], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 360, 14, 14], f16), T([128, 360, 14, 14], f16), T([360, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 360, [True, True, False]), {})
cnt: 1, ((T([128, 360, 14, 14], f16), T([128, 72, 14, 14], f16), T([360, 72, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 72, 14, 14], f16), T([128, 216, 14, 14], f16), T([72, 216, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 216, 14, 14], f16), T([128, 216, 14, 14], f16), T([216, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 216, [True, True, False]), {})
cnt: 4, ((T([128, 216, 14, 14], f16), T([128, 72, 14, 14], f16), T([216, 72, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 72, 14, 14], f16), T([128, 200, 14, 14], f16), T([72, 200, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 200, 14, 14], f16), T([128, 200, 28, 28], f16), T([200, 1, 5, 5], f16), [0], [2, 2], [2, 2], [1, 1], False, [0, 0], 200, [True, True, False]), {})
cnt: 1, ((T([128, 200, 28, 28], f16), T([128, 40, 28, 28], f16), T([200, 40, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 5, ((T([128, 40, 28, 28], f16), T([128, 120, 28, 28], f16), T([40, 120, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 120, 1, 1], f16), T([128, 16, 1, 1], f16), T([120, 16, 1, 1], f16), [120], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 4, ((T([128, 16, 1, 1], f16), T([128, 120, 1, 1], f16), T([16, 120, 1, 1], f16), [16], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 4, ((T([128, 120, 28, 28], f16), T([128, 120, 28, 28], f16), T([120, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 120, [True, True, False]), {})
cnt: 4, ((T([128, 120, 28, 28], f16), T([128, 40, 28, 28], f16), T([120, 40, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 120, 1, 1], f16), T([128, 8, 1, 1], f16), T([120, 8, 1, 1], f16), [120], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 8, 1, 1], f16), T([128, 120, 1, 1], f16), T([8, 120, 1, 1], f16), [8], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 120, 28, 28], f16), T([128, 120, 56, 56], f16), T([120, 1, 5, 5], f16), [0], [2, 2], [2, 2], [1, 1], False, [0, 0], 120, [True, True, False]), {})
cnt: 1, ((T([128, 120, 56, 56], f16), T([128, 24, 56, 56], f16), T([120, 24, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 24, 56, 56], f16), T([128, 48, 56, 56], f16), T([24, 48, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 48, 56, 56], f16), T([128, 48, 56, 56], f16), T([48, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 48, [True, True, False]), {})
cnt: 3, ((T([128, 48, 56, 56], f16), T([128, 24, 56, 56], f16), T([48, 24, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 24, 56, 56], f16), T([128, 64, 56, 56], f16), T([24, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 64, 56, 56], f16), T([128, 64, 112, 112], f16), T([64, 1, 5, 5], f16), [0], [2, 2], [2, 2], [1, 1], False, [0, 0], 64, [True, True, False]), {})
cnt: 1, ((T([128, 64, 112, 112], f16), T([128, 16, 112, 112], f16), T([64, 16, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 16, 112, 112], f16), T([128, 16, 112, 112], f16), T([16, 16, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 16, 112, 112], f16), T([128, 16, 112, 112], f16), T([16, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 16, [True, True, False]), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([128, 3, 224, 224], f16), T([16, 3, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [False, True, False]), {})
Operator: aten.copy_.default
cnt: 1, ((T([128, 3, 224, 224], f16), T([128, 3, 224, 224], f16)), {})
Operator: aten.div.Scalar
cnt: 1, ((T([128, 1344, 7, 7], f16, stride=(1344, 1, 0, 0)), 49), {})
cnt: 1, ((T([128, 1104, 7, 7], f16, stride=(1104, 1, 0, 0)), 49), {})
cnt: 5, ((T([128, 736, 7, 7], f16, stride=(736, 1, 0, 0)), 49), {})
cnt: 1, ((T([128, 720, 7, 7], f16, stride=(720, 1, 0, 0)), 49), {})
cnt: 6, ((T([128, 360, 14, 14], f16, stride=(360, 1, 0, 0)), 196), {})
cnt: 5, ((T([128, 120, 28, 28], f16, stride=(120, 1, 0, 0)), 784), {})
Operator: aten.hardsigmoid.default
cnt: 5, ((T([128, 120, 1, 1], f16),), {})
cnt: 6, ((T([128, 360, 1, 1], f16),), {})
cnt: 1, ((T([128, 720, 1, 1], f16),), {})
cnt: 5, ((T([128, 736, 1, 1], f16),), {})
cnt: 1, ((T([128, 1104, 1, 1], f16),), {})
Operator: aten.hardsigmoid_backward.default
cnt: 1, ((T([128, 1104, 1, 1], f16), T([128, 1104, 1, 1], f16)), {})
cnt: 5, ((T([128, 736, 1, 1], f16), T([128, 736, 1, 1], f16)), {})
cnt: 1, ((T([128, 720, 1, 1], f16), T([128, 720, 1, 1], f16)), {})
cnt: 6, ((T([128, 360, 1, 1], f16), T([128, 360, 1, 1], f16)), {})
cnt: 5, ((T([128, 120, 1, 1], f16), T([128, 120, 1, 1], f16)), {})
Operator: aten.hardswish_.default
cnt: 3, ((T([128, 16, 112, 112], f16),), {})
cnt: 1, ((T([128, 64, 112, 112], f16),), {})
cnt: 1, ((T([128, 64, 56, 56], f16),), {})
cnt: 6, ((T([128, 48, 56, 56], f16),), {})
cnt: 1, ((T([128, 120, 56, 56], f16),), {})
cnt: 9, ((T([128, 120, 28, 28], f16),), {})
cnt: 1, ((T([128, 8, 1, 1], f16),), {})
cnt: 4, ((T([128, 16, 1, 1], f16),), {})
cnt: 1, ((T([128, 200, 28, 28], f16),), {})
cnt: 1, ((T([128, 200, 14, 14], f16),), {})
cnt: 8, ((T([128, 216, 14, 14], f16),), {})
cnt: 12, ((T([128, 360, 14, 14], f16),), {})
cnt: 1, ((T([128, 24, 1, 1], f16),), {})
cnt: 6, ((T([128, 32, 1, 1], f16),), {})
cnt: 1, ((T([128, 720, 14, 14], f16),), {})
cnt: 1, ((T([128, 720, 7, 7], f16),), {})
cnt: 10, ((T([128, 736, 7, 7], f16),), {})
cnt: 6, ((T([128, 48, 1, 1], f16),), {})
cnt: 2, ((T([128, 1104, 7, 7], f16),), {})
cnt: 1, ((T([128, 1344, 7, 7], f16),), {})
cnt: 1, ((T([128, 1984, 1, 1], f16),), {})
Operator: aten.hardswish_backward.default
cnt: 1, ((T([128, 1984, 1, 1], f16), T([128, 1984, 1, 1], f16)), {})
cnt: 1, ((T([128, 1344, 7, 7], f16), T([128, 1344, 7, 7], f16)), {})
cnt: 6, ((T([128, 48, 1, 1], f16), T([128, 48, 1, 1], f16)), {})
cnt: 2, ((T([128, 1104, 7, 7], f16), T([128, 1104, 7, 7], f16)), {})
cnt: 10, ((T([128, 736, 7, 7], f16), T([128, 736, 7, 7], f16)), {})
cnt: 6, ((T([128, 32, 1, 1], f16), T([128, 32, 1, 1], f16)), {})
cnt: 1, ((T([128, 720, 7, 7], f16), T([128, 720, 7, 7], f16)), {})
cnt: 1, ((T([128, 720, 14, 14], f16), T([128, 720, 14, 14], f16)), {})
cnt: 12, ((T([128, 360, 14, 14], f16), T([128, 360, 14, 14], f16)), {})
cnt: 1, ((T([128, 24, 1, 1], f16), T([128, 24, 1, 1], f16)), {})
cnt: 8, ((T([128, 216, 14, 14], f16), T([128, 216, 14, 14], f16)), {})
cnt: 1, ((T([128, 200, 14, 14], f16), T([128, 200, 14, 14], f16)), {})
cnt: 1, ((T([128, 200, 28, 28], f16), T([128, 200, 28, 28], f16)), {})
cnt: 4, ((T([128, 16, 1, 1], f16), T([128, 16, 1, 1], f16)), {})
cnt: 9, ((T([128, 120, 28, 28], f16), T([128, 120, 28, 28], f16)), {})
cnt: 1, ((T([128, 8, 1, 1], f16), T([128, 8, 1, 1], f16)), {})
cnt: 1, ((T([128, 120, 56, 56], f16), T([128, 120, 56, 56], f16)), {})
cnt: 6, ((T([128, 48, 56, 56], f16), T([128, 48, 56, 56], f16)), {})
cnt: 1, ((T([128, 64, 56, 56], f16), T([128, 64, 56, 56], f16)), {})
cnt: 1, ((T([128, 64, 112, 112], f16), T([128, 64, 112, 112], f16)), {})
cnt: 3, ((T([128, 16, 112, 112], f16), T([128, 16, 112, 112], f16)), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([128], i64),), {})
Operator: aten.mean.dim
cnt: 5, ((T([128, 120, 28, 28], f16), [2, 3], True), {})
cnt: 6, ((T([128, 360, 14, 14], f16), [2, 3], True), {})
cnt: 1, ((T([128, 720, 7, 7], f16), [2, 3], True), {})
cnt: 5, ((T([128, 736, 7, 7], f16), [2, 3], True), {})
cnt: 1, ((T([128, 1104, 7, 7], f16), [2, 3], True), {})
cnt: 1, ((T([128, 1344, 7, 7], f16), [-1, -2], True), {})
Operator: aten.mm.default
cnt: 1, ((T([128, 1000], f16), T([1000, 1984], f16)), {})
cnt: 1, ((T([1000, 128], f16, stride=(1, 1000)), T([128, 1984], f16)), {})
Operator: aten.mul.Tensor
cnt: 10, ((T([128, 120, 28, 28], f16), T([128, 120, 1, 1], f16)), {})
cnt: 12, ((T([128, 360, 14, 14], f16), T([128, 360, 1, 1], f16)), {})
cnt: 2, ((T([128, 720, 7, 7], f16), T([128, 720, 1, 1], f16)), {})
cnt: 10, ((T([128, 736, 7, 7], f16), T([128, 736, 1, 1], f16)), {})
cnt: 2, ((T([128, 1104, 7, 7], f16), T([128, 1104, 1, 1], f16)), {})
cnt: 1, ((T([128, 1104, 7, 7], f16), T([128, 1104, 7, 7], f16)), {})
cnt: 5, ((T([128, 736, 7, 7], f16), T([128, 736, 7, 7], f16)), {})
cnt: 1, ((T([128, 720, 7, 7], f16), T([128, 720, 7, 7], f16)), {})
cnt: 6, ((T([128, 360, 14, 14], f16), T([128, 360, 14, 14], f16)), {})
cnt: 5, ((T([128, 120, 28, 28], f16), T([128, 120, 28, 28], f16)), {})
Operator: aten.native_batch_norm.default
cnt: 5, ((T([128, 16, 112, 112], f16), T([16], f16), T([16], f16), T([16], f16), T([16], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 64, 112, 112], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 64, 56, 56], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([128, 24, 56, 56], f16), T([24], f16), T([24], f16), T([24], f16), T([24], f16), True, 0.1, 1e-05), {})
cnt: 6, ((T([128, 48, 56, 56], f16), T([48], f16), T([48], f16), T([48], f16), T([48], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 120, 56, 56], f16), T([120], f16), T([120], f16), T([120], f16), T([120], f16), True, 0.1, 1e-05), {})
cnt: 9, ((T([128, 120, 28, 28], f16), T([120], f16), T([120], f16), T([120], f16), T([120], f16), True, 0.1, 1e-05), {})
cnt: 5, ((T([128, 40, 28, 28], f16), T([40], f16), T([40], f16), T([40], f16), T([40], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 200, 28, 28], f16), T([200], f16), T([200], f16), T([200], f16), T([200], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 200, 14, 14], f16), T([200], f16), T([200], f16), T([200], f16), T([200], f16), True, 0.1, 1e-05), {})
cnt: 5, ((T([128, 72, 14, 14], f16), T([72], f16), T([72], f16), T([72], f16), T([72], f16), True, 0.1, 1e-05), {})
cnt: 8, ((T([128, 216, 14, 14], f16), T([216], f16), T([216], f16), T([216], f16), T([216], f16), True, 0.1, 1e-05), {})
cnt: 12, ((T([128, 360, 14, 14], f16), T([360], f16), T([360], f16), T([360], f16), T([360], f16), True, 0.1, 1e-05), {})
cnt: 6, ((T([128, 120, 14, 14], f16), T([120], f16), T([120], f16), T([120], f16), T([120], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 720, 14, 14], f16), T([720], f16), T([720], f16), T([720], f16), T([720], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 720, 7, 7], f16), T([720], f16), T([720], f16), T([720], f16), T([720], f16), True, 0.1, 1e-05), {})
cnt: 6, ((T([128, 184, 7, 7], f16), T([184], f16), T([184], f16), T([184], f16), T([184], f16), True, 0.1, 1e-05), {})
cnt: 10, ((T([128, 736, 7, 7], f16), T([736], f16), T([736], f16), T([736], f16), T([736], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([128, 1104, 7, 7], f16), T([1104], f16), T([1104], f16), T([1104], f16), T([1104], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 224, 7, 7], f16), T([224], f16), T([224], f16), T([224], f16), T([224], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 1344, 7, 7], f16), T([1344], f16), T([1344], f16), T([1344], f16), T([1344], f16), True, 0.1, 1e-05), {})
Operator: aten.native_batch_norm_backward.default
cnt: 1, ((T([128, 1344, 7, 7], f16), T([128, 1344, 7, 7], f16), T([1344], f16), T([1344], f16), T([1344], f16), T([1344], f32), T([1344], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 224, 7, 7], f16), T([128, 224, 7, 7], f16), T([224], f16), T([224], f16), T([224], f16), T([224], f32), T([224], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 1104, 7, 7], f16), T([128, 1104, 7, 7], f16), T([1104], f16), T([1104], f16), T([1104], f16), T([1104], f32), T([1104], f32), True, 1e-05, [True, True, True]), {})
cnt: 6, ((T([128, 184, 7, 7], f16), T([128, 184, 7, 7], f16), T([184], f16), T([184], f16), T([184], f16), T([184], f32), T([184], f32), True, 1e-05, [True, True, True]), {})
cnt: 10, ((T([128, 736, 7, 7], f16), T([128, 736, 7, 7], f16), T([736], f16), T([736], f16), T([736], f16), T([736], f32), T([736], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 720, 7, 7], f16), T([128, 720, 7, 7], f16), T([720], f16), T([720], f16), T([720], f16), T([720], f32), T([720], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 720, 14, 14], f16), T([128, 720, 14, 14], f16), T([720], f16), T([720], f16), T([720], f16), T([720], f32), T([720], f32), True, 1e-05, [True, True, True]), {})
cnt: 6, ((T([128, 120, 14, 14], f16), T([128, 120, 14, 14], f16), T([120], f16), T([120], f16), T([120], f16), T([120], f32), T([120], f32), True, 1e-05, [True, True, True]), {})
cnt: 12, ((T([128, 360, 14, 14], f16), T([128, 360, 14, 14], f16), T([360], f16), T([360], f16), T([360], f16), T([360], f32), T([360], f32), True, 1e-05, [True, True, True]), {})
cnt: 5, ((T([128, 72, 14, 14], f16), T([128, 72, 14, 14], f16), T([72], f16), T([72], f16), T([72], f16), T([72], f32), T([72], f32), True, 1e-05, [True, True, True]), {})
cnt: 8, ((T([128, 216, 14, 14], f16), T([128, 216, 14, 14], f16), T([216], f16), T([216], f16), T([216], f16), T([216], f32), T([216], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 200, 14, 14], f16), T([128, 200, 14, 14], f16), T([200], f16), T([200], f16), T([200], f16), T([200], f32), T([200], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 200, 28, 28], f16), T([128, 200, 28, 28], f16), T([200], f16), T([200], f16), T([200], f16), T([200], f32), T([200], f32), True, 1e-05, [True, True, True]), {})
cnt: 5, ((T([128, 40, 28, 28], f16), T([128, 40, 28, 28], f16), T([40], f16), T([40], f16), T([40], f16), T([40], f32), T([40], f32), True, 1e-05, [True, True, True]), {})
cnt: 9, ((T([128, 120, 28, 28], f16), T([128, 120, 28, 28], f16), T([120], f16), T([120], f16), T([120], f16), T([120], f32), T([120], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 120, 56, 56], f16), T([128, 120, 56, 56], f16), T([120], f16), T([120], f16), T([120], f16), T([120], f32), T([120], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([128, 24, 56, 56], f16), T([128, 24, 56, 56], f16), T([24], f16), T([24], f16), T([24], f16), T([24], f32), T([24], f32), True, 1e-05, [True, True, True]), {})
cnt: 6, ((T([128, 48, 56, 56], f16), T([128, 48, 56, 56], f16), T([48], f16), T([48], f16), T([48], f16), T([48], f32), T([48], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 64, 56, 56], f16), T([128, 64, 56, 56], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 64, 112, 112], f16), T([128, 64, 112, 112], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
cnt: 5, ((T([128, 16, 112, 112], f16), T([128, 16, 112, 112], f16), T([16], f16), T([16], f16), T([16], f16), T([16], f32), T([16], f32), True, 1e-05, [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([128, 1000], f16), T([128], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([128, 1000], f16), T([128], i64), None, 1, -100), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([128, 1000], f16), [0], True), {})
cnt: 1, ((T([128, 1104, 7, 7], f16), [2, 3], True), {})
cnt: 5, ((T([128, 736, 7, 7], f16), [2, 3], True), {})
cnt: 1, ((T([128, 720, 7, 7], f16), [2, 3], True), {})
cnt: 6, ((T([128, 360, 14, 14], f16), [2, 3], True), {})
cnt: 5, ((T([128, 120, 28, 28], f16), [2, 3], True), {})

View File

@ -0,0 +1,118 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([128, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([128, 1000], f16), T([128, 1000], f16), 1, f16), {})
Operator: aten.add.Tensor
cnt: 57, ((T([], i64), 1), {})
cnt: 2, ((T([128, 128, 64, 64], f16), T([128, 128, 64, 64], f16)), {})
cnt: 4, ((T([128, 192, 32, 32], f16), T([128, 192, 32, 32], f16)), {})
cnt: 12, ((T([128, 640, 16, 16], f16), T([128, 640, 16, 16], f16)), {})
cnt: 17, ((T([128, 640, 8, 8], f16), T([128, 640, 8, 8], f16)), {})
cnt: 1, ((T([128, 32, 128, 128], f16), T([128, 32, 128, 128], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([128, 2560], f16), T([2560, 1000], f16, stride=(1, 2560))), {})
Operator: aten.clone.default
cnt: 1, ((T([128, 3, 256, 256], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([128, 3, 256, 256], f16), T([32, 3, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 32, 128, 128], f16), T([128, 32, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 128, 64, 64], f16), T([128, 128, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 32, 128, 128], f16), T([128, 32, 1, 1], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 128, 64, 64], f16), T([192, 128, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 192, 32, 32], f16), T([192, 192, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 128, 64, 64], f16), T([192, 128, 1, 1], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 192, 32, 32], f16), T([160, 192, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 160, 32, 32], f16), T([160, 160, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 6, ((T([128, 160, 16, 16], f16), T([640, 160, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 192, 32, 32], f16), T([640, 192, 1, 1], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 5, ((T([128, 640, 16, 16], f16), T([160, 640, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 5, ((T([128, 160, 16, 16], f16), T([160, 160, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 640, 16, 16], f16), T([1920, 640, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1920, 16, 16], f16), T([1920, 1, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1920), {})
cnt: 9, ((T([128, 1920, 8, 8], f16), T([640, 1920, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 640, 16, 16], f16), T([640, 640, 1, 1], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 8, ((T([128, 640, 8, 8], f16), T([1920, 640, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 8, ((T([128, 1920, 8, 8], f16), T([1920, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1920), {})
cnt: 1, ((T([128, 640, 8, 8], f16), T([2560, 640, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([128, 2560, 8, 8], f16), T([128, 640, 8, 8], f16), T([2560, 640, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 9, ((T([128, 640, 8, 8], f16), T([128, 1920, 8, 8], f16), T([640, 1920, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 8, ((T([128, 1920, 8, 8], f16), T([128, 1920, 8, 8], f16), T([1920, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1920, [True, True, False]), {})
cnt: 8, ((T([128, 1920, 8, 8], f16), T([128, 640, 8, 8], f16), T([1920, 640, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 640, 8, 8], f16), T([128, 640, 16, 16], f16), T([640, 640, 1, 1], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 1920, 8, 8], f16), T([128, 1920, 16, 16], f16), T([1920, 1, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1920, [True, True, False]), {})
cnt: 1, ((T([128, 1920, 16, 16], f16), T([128, 640, 16, 16], f16), T([1920, 640, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 6, ((T([128, 640, 16, 16], f16), T([128, 160, 16, 16], f16), T([640, 160, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 5, ((T([128, 160, 16, 16], f16), T([128, 160, 16, 16], f16), T([160, 160, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 5, ((T([128, 160, 16, 16], f16), T([128, 640, 16, 16], f16), T([160, 640, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 640, 16, 16], f16), T([128, 192, 32, 32], f16), T([640, 192, 1, 1], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 160, 16, 16], f16), T([128, 160, 32, 32], f16), T([160, 160, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 160, 32, 32], f16), T([128, 192, 32, 32], f16), T([160, 192, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 192, 32, 32], f16), T([128, 192, 32, 32], f16), T([192, 192, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 192, 32, 32], f16), T([128, 128, 64, 64], f16), T([192, 128, 1, 1], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 192, 32, 32], f16), T([128, 128, 64, 64], f16), T([192, 128, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 128, 64, 64], f16), T([128, 32, 128, 128], f16), T([128, 32, 1, 1], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 128, 64, 64], f16), T([128, 128, 64, 64], f16), T([128, 128, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 128, 64, 64], f16), T([128, 32, 128, 128], f16), T([128, 32, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 32, 128, 128], f16), T([128, 3, 256, 256], f16), T([32, 3, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [False, True, False]), {})
Operator: aten.copy_.default
cnt: 1, ((T([128, 3, 256, 256], f16), T([128, 3, 256, 256], f16)), {})
Operator: aten.div.Scalar
cnt: 1, ((T([128, 2560, 8, 8], f16, stride=(2560, 1, 0, 0)), 64), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([128], i64),), {})
Operator: aten.mean.dim
cnt: 1, ((T([128, 2560, 8, 8], f16), [-1, -2], True), {})
Operator: aten.mm.default
cnt: 1, ((T([128, 1000], f16), T([1000, 2560], f16)), {})
cnt: 1, ((T([1000, 128], f16, stride=(1, 1000)), T([128, 2560], f16)), {})
Operator: aten.native_batch_norm.default
cnt: 1, ((T([128, 32, 128, 128], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 128, 64, 64], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 5, ((T([128, 192, 32, 32], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 160, 32, 32], f16), T([160], f16), T([160], f16), T([160], f16), T([160], f16), True, 0.1, 1e-05), {})
cnt: 11, ((T([128, 160, 16, 16], f16), T([160], f16), T([160], f16), T([160], f16), T([160], f16), True, 0.1, 1e-05), {})
cnt: 7, ((T([128, 640, 16, 16], f16), T([640], f16), T([640], f16), T([640], f16), T([640], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 1920, 16, 16], f16), T([1920], f16), T([1920], f16), T([1920], f16), T([1920], f16), True, 0.1, 1e-05), {})
cnt: 17, ((T([128, 1920, 8, 8], f16), T([1920], f16), T([1920], f16), T([1920], f16), T([1920], f16), True, 0.1, 1e-05), {})
cnt: 10, ((T([128, 640, 8, 8], f16), T([640], f16), T([640], f16), T([640], f16), T([640], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 2560, 8, 8], f16), T([2560], f16), T([2560], f16), T([2560], f16), T([2560], f16), True, 0.1, 1e-05), {})
Operator: aten.native_batch_norm_backward.default
cnt: 1, ((T([128, 2560, 8, 8], f16), T([128, 2560, 8, 8], f16), T([2560], f16), T([2560], f16), T([2560], f16), T([2560], f32), T([2560], f32), True, 1e-05, [True, True, True]), {})
cnt: 10, ((T([128, 640, 8, 8], f16), T([128, 640, 8, 8], f16), T([640], f16), T([640], f16), T([640], f16), T([640], f32), T([640], f32), True, 1e-05, [True, True, True]), {})
cnt: 17, ((T([128, 1920, 8, 8], f16), T([128, 1920, 8, 8], f16), T([1920], f16), T([1920], f16), T([1920], f16), T([1920], f32), T([1920], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 1920, 16, 16], f16), T([128, 1920, 16, 16], f16), T([1920], f16), T([1920], f16), T([1920], f16), T([1920], f32), T([1920], f32), True, 1e-05, [True, True, True]), {})
cnt: 7, ((T([128, 640, 16, 16], f16), T([128, 640, 16, 16], f16), T([640], f16), T([640], f16), T([640], f16), T([640], f32), T([640], f32), True, 1e-05, [True, True, True]), {})
cnt: 11, ((T([128, 160, 16, 16], f16), T([128, 160, 16, 16], f16), T([160], f16), T([160], f16), T([160], f16), T([160], f32), T([160], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 160, 32, 32], f16), T([128, 160, 32, 32], f16), T([160], f16), T([160], f16), T([160], f16), T([160], f32), T([160], f32), True, 1e-05, [True, True, True]), {})
cnt: 5, ((T([128, 192, 32, 32], f16), T([128, 192, 32, 32], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f32), T([192], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 128, 64, 64], f16), T([128, 128, 64, 64], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 32, 128, 128], f16), T([128, 32, 128, 128], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 1e-05, [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([128, 1000], f16), T([128], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([128, 1000], f16), T([128], i64), None, 1, -100), {})
Operator: aten.relu_.default
cnt: 1, ((T([128, 32, 128, 128], f16),), {})
cnt: 2, ((T([128, 128, 64, 64], f16),), {})
cnt: 4, ((T([128, 192, 32, 32], f16),), {})
cnt: 1, ((T([128, 160, 32, 32], f16),), {})
cnt: 11, ((T([128, 160, 16, 16], f16),), {})
cnt: 6, ((T([128, 640, 16, 16], f16),), {})
cnt: 1, ((T([128, 1920, 16, 16], f16),), {})
cnt: 17, ((T([128, 1920, 8, 8], f16),), {})
cnt: 9, ((T([128, 640, 8, 8], f16),), {})
cnt: 1, ((T([128, 2560, 8, 8], f16),), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([128, 1000], f16), [0], True), {})
Operator: aten.threshold_backward.default
cnt: 1, ((T([128, 2560, 8, 8], f16), T([128, 2560, 8, 8], f16), 0), {})
cnt: 9, ((T([128, 640, 8, 8], f16), T([128, 640, 8, 8], f16), 0), {})
cnt: 17, ((T([128, 1920, 8, 8], f16), T([128, 1920, 8, 8], f16), 0), {})
cnt: 1, ((T([128, 1920, 16, 16], f16), T([128, 1920, 16, 16], f16), 0), {})
cnt: 6, ((T([128, 640, 16, 16], f16), T([128, 640, 16, 16], f16), 0), {})
cnt: 11, ((T([128, 160, 16, 16], f16), T([128, 160, 16, 16], f16), 0), {})
cnt: 1, ((T([128, 160, 32, 32], f16), T([128, 160, 32, 32], f16), 0), {})
cnt: 4, ((T([128, 192, 32, 32], f16), T([128, 192, 32, 32], f16), 0), {})
cnt: 2, ((T([128, 128, 64, 64], f16), T([128, 128, 64, 64], f16), 0), {})
cnt: 1, ((T([128, 32, 128, 128], f16), T([128, 32, 128, 128], f16), 0), {})

View File

@ -0,0 +1,411 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([128, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([128, 1000], f16), T([128, 1000], f16), 1, f16), {})
Operator: aten.add.Tensor
cnt: 1, ((T([], i64), 1), {})
cnt: 5, ((T([128, 80, 7, 7], f16, stride=(7840, 49, 7, 1)), T([128, 80, 7, 7], f16)), {})
cnt: 2, ((T([128, 960, 7, 7], f16), T([128, 960, 7, 7], f16)), {})
cnt: 4, ((T([128, 480, 7, 7], f16, stride=(47040, 49, 7, 1)), T([128, 480, 7, 7], f16)), {})
cnt: 4, ((T([128, 160, 7, 7], f16), T([128, 160, 7, 7], f16)), {})
cnt: 1, ((T([128, 672, 7, 7], f16), T([128, 672, 7, 7], f16)), {})
cnt: 2, ((T([128, 336, 14, 14], f16, stride=(131712, 196, 14, 1)), T([128, 336, 14, 14], f16)), {})
cnt: 2, ((T([128, 112, 14, 14], f16), T([128, 112, 14, 14], f16)), {})
cnt: 2, ((T([128, 56, 14, 14], f16, stride=(21952, 196, 14, 1)), T([128, 56, 14, 14], f16)), {})
cnt: 1, ((T([128, 672, 14, 14], f16), T([128, 672, 14, 14], f16)), {})
cnt: 1, ((T([128, 480, 14, 14], f16), T([128, 480, 14, 14], f16)), {})
cnt: 1, ((T([128, 240, 14, 14], f16, stride=(94080, 196, 14, 1)), T([128, 240, 14, 14], f16)), {})
cnt: 4, ((T([128, 80, 14, 14], f16), T([128, 80, 14, 14], f16)), {})
cnt: 4, ((T([128, 40, 14, 14], f16, stride=(15680, 196, 14, 1)), T([128, 40, 14, 14], f16)), {})
cnt: 2, ((T([128, 92, 14, 14], f16, stride=(36064, 196, 14, 1)), T([128, 92, 14, 14], f16)), {})
cnt: 1, ((T([128, 100, 14, 14], f16, stride=(39200, 196, 14, 1)), T([128, 100, 14, 14], f16)), {})
cnt: 1, ((T([128, 120, 28, 28], f16, stride=(188160, 784, 28, 1)), T([128, 120, 28, 28], f16)), {})
cnt: 2, ((T([128, 40, 28, 28], f16), T([128, 40, 28, 28], f16)), {})
cnt: 2, ((T([128, 20, 28, 28], f16, stride=(31360, 784, 28, 1)), T([128, 20, 28, 28], f16)), {})
cnt: 1, ((T([128, 120, 28, 28], f16), T([128, 120, 28, 28], f16)), {})
cnt: 1, ((T([128, 60, 28, 28], f16, stride=(94080, 784, 28, 1)), T([128, 60, 28, 28], f16)), {})
cnt: 1, ((T([128, 72, 28, 28], f16), T([128, 72, 28, 28], f16)), {})
cnt: 2, ((T([128, 36, 56, 56], f16, stride=(225792, 3136, 56, 1)), T([128, 36, 56, 56], f16)), {})
cnt: 2, ((T([128, 24, 56, 56], f16), T([128, 24, 56, 56], f16)), {})
cnt: 2, ((T([128, 12, 56, 56], f16, stride=(75264, 3136, 56, 1)), T([128, 12, 56, 56], f16)), {})
cnt: 1, ((T([128, 24, 112, 112], f16, stride=(602112, 12544, 112, 1)), T([128, 24, 112, 112], f16)), {})
cnt: 2, ((T([128, 16, 112, 112], f16), T([128, 16, 112, 112], f16)), {})
cnt: 2, ((T([128, 8, 112, 112], f16, stride=(200704, 12544, 112, 1)), T([128, 8, 112, 112], f16)), {})
Operator: aten.add_.Tensor
cnt: 79, ((T([], i64), 1), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([128, 16, 112, 112], f16)), {})
cnt: 2, ((T([128, 24, 56, 56], f16), T([128, 24, 56, 56], f16)), {})
cnt: 2, ((T([128, 40, 28, 28], f16), T([128, 40, 28, 28], f16)), {})
cnt: 4, ((T([128, 80, 14, 14], f16), T([128, 80, 14, 14], f16)), {})
cnt: 2, ((T([128, 112, 14, 14], f16), T([128, 112, 14, 14], f16)), {})
cnt: 5, ((T([128, 160, 7, 7], f16), T([128, 160, 7, 7], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([128, 1280], f16), T([1280, 1000], f16, stride=(1, 1280))), {})
Operator: aten.cat.default
cnt: 2, (([T([128, 8, 112, 112], f16), T([128, 8, 112, 112], f16)], 1), {})
cnt: 1, (([T([128, 24, 112, 112], f16), T([128, 24, 112, 112], f16)], 1), {})
cnt: 2, (([T([128, 12, 56, 56], f16), T([128, 12, 56, 56], f16)], 1), {})
cnt: 2, (([T([128, 36, 56, 56], f16), T([128, 36, 56, 56], f16)], 1), {})
cnt: 2, (([T([128, 20, 28, 28], f16), T([128, 20, 28, 28], f16)], 1), {})
cnt: 1, (([T([128, 60, 28, 28], f16), T([128, 60, 28, 28], f16)], 1), {})
cnt: 1, (([T([128, 120, 28, 28], f16), T([128, 120, 28, 28], f16)], 1), {})
cnt: 4, (([T([128, 40, 14, 14], f16), T([128, 40, 14, 14], f16)], 1), {})
cnt: 1, (([T([128, 100, 14, 14], f16), T([128, 100, 14, 14], f16)], 1), {})
cnt: 2, (([T([128, 92, 14, 14], f16), T([128, 92, 14, 14], f16)], 1), {})
cnt: 1, (([T([128, 240, 14, 14], f16), T([128, 240, 14, 14], f16)], 1), {})
cnt: 2, (([T([128, 56, 14, 14], f16), T([128, 56, 14, 14], f16)], 1), {})
cnt: 2, (([T([128, 336, 14, 14], f16), T([128, 336, 14, 14], f16)], 1), {})
cnt: 5, (([T([128, 80, 7, 7], f16), T([128, 80, 7, 7], f16)], 1), {})
cnt: 4, (([T([128, 480, 7, 7], f16), T([128, 480, 7, 7], f16)], 1), {})
Operator: aten.clone.default
cnt: 1, ((T([128, 3, 224, 224], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([128, 3, 224, 224], f16), T([16, 3, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 16, 112, 112], f16), T([8, 16, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 8, 112, 112], f16), T([8, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 8), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([24, 16, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 24, 112, 112], f16), T([24, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 24), {})
cnt: 1, ((T([128, 48, 112, 112], f16), T([48, 1, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 48), {})
cnt: 1, ((T([128, 48, 56, 56], f16), T([12, 48, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 12, 56, 56], f16), T([12, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 12), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([16, 1, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 16), {})
cnt: 1, ((T([128, 16, 56, 56], f16), T([24, 16, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 24, 56, 56], f16), T([36, 24, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 36, 56, 56], f16), T([36, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 36), {})
cnt: 1, ((T([128, 72, 56, 56], f16), T([12, 72, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 72, 56, 56], f16), T([72, 1, 5, 5], f16), None, [2, 2], [2, 2], [1, 1], False, [0, 0], 72), {})
cnt: 1, ((T([128, 72, 1, 1], f16), T([20, 72, 1, 1], f16), T([20], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 20, 1, 1], f16), T([72, 20, 1, 1], f16), T([72], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 72, 28, 28], f16), T([20, 72, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 20, 28, 28], f16), T([20, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 20), {})
cnt: 1, ((T([128, 24, 56, 56], f16), T([24, 1, 5, 5], f16), None, [2, 2], [2, 2], [1, 1], False, [0, 0], 24), {})
cnt: 1, ((T([128, 24, 28, 28], f16), T([40, 24, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 40, 28, 28], f16), T([60, 40, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 60, 28, 28], f16), T([60, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 60), {})
cnt: 1, ((T([128, 120, 1, 1], f16), T([32, 120, 1, 1], f16), T([32], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 32, 1, 1], f16), T([120, 32, 1, 1], f16), T([120], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 120, 28, 28], f16), T([20, 120, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 40, 28, 28], f16), T([120, 40, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 120, 28, 28], f16), T([120, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 120), {})
cnt: 1, ((T([128, 240, 28, 28], f16), T([240, 1, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 240), {})
cnt: 1, ((T([128, 240, 14, 14], f16), T([40, 240, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 40, 14, 14], f16), T([40, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 40), {})
cnt: 1, ((T([128, 40, 28, 28], f16), T([40, 1, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 40), {})
cnt: 1, ((T([128, 40, 14, 14], f16), T([80, 40, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 80, 14, 14], f16), T([100, 80, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 100, 14, 14], f16), T([100, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 100), {})
cnt: 1, ((T([128, 200, 14, 14], f16), T([40, 200, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 80, 14, 14], f16), T([92, 80, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 92, 14, 14], f16), T([92, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 92), {})
cnt: 2, ((T([128, 184, 14, 14], f16), T([40, 184, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 80, 14, 14], f16), T([240, 80, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 240, 14, 14], f16), T([240, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 240), {})
cnt: 1, ((T([128, 480, 1, 1], f16), T([120, 480, 1, 1], f16), T([120], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 120, 1, 1], f16), T([480, 120, 1, 1], f16), T([480], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 480, 14, 14], f16), T([56, 480, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 56, 14, 14], f16), T([56, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 56), {})
cnt: 1, ((T([128, 80, 14, 14], f16), T([80, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 80), {})
cnt: 1, ((T([128, 80, 14, 14], f16), T([112, 80, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 112, 14, 14], f16), T([336, 112, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 336, 14, 14], f16), T([336, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 336), {})
cnt: 2, ((T([128, 672, 1, 1], f16), T([168, 672, 1, 1], f16), T([168], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 168, 1, 1], f16), T([672, 168, 1, 1], f16), T([672], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 672, 14, 14], f16), T([56, 672, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 672, 14, 14], f16), T([672, 1, 5, 5], f16), None, [2, 2], [2, 2], [1, 1], False, [0, 0], 672), {})
cnt: 1, ((T([128, 672, 7, 7], f16), T([80, 672, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 5, ((T([128, 80, 7, 7], f16), T([80, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 80), {})
cnt: 1, ((T([128, 112, 14, 14], f16), T([112, 1, 5, 5], f16), None, [2, 2], [2, 2], [1, 1], False, [0, 0], 112), {})
cnt: 1, ((T([128, 112, 7, 7], f16), T([160, 112, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 160, 7, 7], f16), T([480, 160, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 480, 7, 7], f16), T([480, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 480), {})
cnt: 4, ((T([128, 960, 7, 7], f16), T([80, 960, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 960, 1, 1], f16), T([240, 960, 1, 1], f16), T([240], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 240, 1, 1], f16), T([960, 240, 1, 1], f16), T([960], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 160, 7, 7], f16), T([960, 160, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 960, 1, 1], f16), T([1280, 960, 1, 1], f16), T([1280], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([128, 1280, 1, 1], f16), T([128, 960, 1, 1], f16), T([1280, 960, 1, 1], f16), [1280], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 960, 7, 7], f16), T([128, 160, 7, 7], f16), T([960, 160, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 5, ((T([128, 80, 7, 7], f16), T([128, 80, 7, 7], f16), T([80, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 80, [True, True, False]), {})
cnt: 4, ((T([128, 80, 7, 7], f16), T([128, 960, 7, 7], f16), T([80, 960, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 960, 1, 1], f16), T([128, 240, 1, 1], f16), T([960, 240, 1, 1], f16), [960], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 2, ((T([128, 240, 1, 1], f16), T([128, 960, 1, 1], f16), T([240, 960, 1, 1], f16), [240], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 4, ((T([128, 480, 7, 7], f16), T([128, 480, 7, 7], f16), T([480, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 480, [True, True, False]), {})
cnt: 4, ((T([128, 480, 7, 7], f16), T([128, 160, 7, 7], f16), T([480, 160, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 160, 7, 7], f16), T([128, 112, 7, 7], f16), T([160, 112, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 112, 7, 7], f16), T([128, 112, 14, 14], f16), T([112, 1, 5, 5], f16), [0], [2, 2], [2, 2], [1, 1], False, [0, 0], 112, [True, True, False]), {})
cnt: 1, ((T([128, 80, 7, 7], f16), T([128, 672, 7, 7], f16), T([80, 672, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 672, 1, 1], f16), T([128, 168, 1, 1], f16), T([672, 168, 1, 1], f16), [672], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 2, ((T([128, 168, 1, 1], f16), T([128, 672, 1, 1], f16), T([168, 672, 1, 1], f16), [168], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 672, 7, 7], f16), T([128, 672, 14, 14], f16), T([672, 1, 5, 5], f16), [0], [2, 2], [2, 2], [1, 1], False, [0, 0], 672, [True, True, False]), {})
cnt: 2, ((T([128, 336, 14, 14], f16), T([128, 336, 14, 14], f16), T([336, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 336, [True, True, False]), {})
cnt: 2, ((T([128, 336, 14, 14], f16), T([128, 112, 14, 14], f16), T([336, 112, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 56, 14, 14], f16), T([128, 56, 14, 14], f16), T([56, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 56, [True, True, False]), {})
cnt: 1, ((T([128, 56, 14, 14], f16), T([128, 672, 14, 14], f16), T([56, 672, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 112, 14, 14], f16), T([128, 80, 14, 14], f16), T([112, 80, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 80, 14, 14], f16), T([128, 80, 14, 14], f16), T([80, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 80, [True, True, False]), {})
cnt: 1, ((T([128, 56, 14, 14], f16), T([128, 480, 14, 14], f16), T([56, 480, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 480, 1, 1], f16), T([128, 120, 1, 1], f16), T([480, 120, 1, 1], f16), [480], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 120, 1, 1], f16), T([128, 480, 1, 1], f16), T([120, 480, 1, 1], f16), [120], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 240, 14, 14], f16), T([128, 240, 14, 14], f16), T([240, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 240, [True, True, False]), {})
cnt: 1, ((T([128, 240, 14, 14], f16), T([128, 80, 14, 14], f16), T([240, 80, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 40, 14, 14], f16), T([128, 40, 14, 14], f16), T([40, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 40, [True, True, False]), {})
cnt: 2, ((T([128, 40, 14, 14], f16), T([128, 184, 14, 14], f16), T([40, 184, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 92, 14, 14], f16), T([128, 92, 14, 14], f16), T([92, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 92, [True, True, False]), {})
cnt: 2, ((T([128, 92, 14, 14], f16), T([128, 80, 14, 14], f16), T([92, 80, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 40, 14, 14], f16), T([128, 200, 14, 14], f16), T([40, 200, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 100, 14, 14], f16), T([128, 100, 14, 14], f16), T([100, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 100, [True, True, False]), {})
cnt: 1, ((T([128, 100, 14, 14], f16), T([128, 80, 14, 14], f16), T([100, 80, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 80, 14, 14], f16), T([128, 40, 14, 14], f16), T([80, 40, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 40, 14, 14], f16), T([128, 40, 28, 28], f16), T([40, 1, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 40, [True, True, False]), {})
cnt: 1, ((T([128, 40, 14, 14], f16), T([128, 240, 14, 14], f16), T([40, 240, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 240, 14, 14], f16), T([128, 240, 28, 28], f16), T([240, 1, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 240, [True, True, False]), {})
cnt: 1, ((T([128, 120, 28, 28], f16), T([128, 120, 28, 28], f16), T([120, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 120, [True, True, False]), {})
cnt: 1, ((T([128, 120, 28, 28], f16), T([128, 40, 28, 28], f16), T([120, 40, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 20, 28, 28], f16), T([128, 20, 28, 28], f16), T([20, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 20, [True, True, False]), {})
cnt: 1, ((T([128, 20, 28, 28], f16), T([128, 120, 28, 28], f16), T([20, 120, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 120, 1, 1], f16), T([128, 32, 1, 1], f16), T([120, 32, 1, 1], f16), [120], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 32, 1, 1], f16), T([128, 120, 1, 1], f16), T([32, 120, 1, 1], f16), [32], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 60, 28, 28], f16), T([128, 60, 28, 28], f16), T([60, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 60, [True, True, False]), {})
cnt: 1, ((T([128, 60, 28, 28], f16), T([128, 40, 28, 28], f16), T([60, 40, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 40, 28, 28], f16), T([128, 24, 28, 28], f16), T([40, 24, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 24, 28, 28], f16), T([128, 24, 56, 56], f16), T([24, 1, 5, 5], f16), [0], [2, 2], [2, 2], [1, 1], False, [0, 0], 24, [True, True, False]), {})
cnt: 1, ((T([128, 20, 28, 28], f16), T([128, 72, 28, 28], f16), T([20, 72, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 72, 1, 1], f16), T([128, 20, 1, 1], f16), T([72, 20, 1, 1], f16), [72], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 20, 1, 1], f16), T([128, 72, 1, 1], f16), T([20, 72, 1, 1], f16), [20], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 72, 28, 28], f16), T([128, 72, 56, 56], f16), T([72, 1, 5, 5], f16), [0], [2, 2], [2, 2], [1, 1], False, [0, 0], 72, [True, True, False]), {})
cnt: 2, ((T([128, 36, 56, 56], f16), T([128, 36, 56, 56], f16), T([36, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 36, [True, True, False]), {})
cnt: 2, ((T([128, 36, 56, 56], f16), T([128, 24, 56, 56], f16), T([36, 24, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 12, 56, 56], f16), T([128, 12, 56, 56], f16), T([12, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 12, [True, True, False]), {})
cnt: 1, ((T([128, 12, 56, 56], f16), T([128, 72, 56, 56], f16), T([12, 72, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 24, 56, 56], f16), T([128, 16, 56, 56], f16), T([24, 16, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 16, 56, 56], f16), T([128, 16, 112, 112], f16), T([16, 1, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 16, [True, True, False]), {})
cnt: 1, ((T([128, 12, 56, 56], f16), T([128, 48, 56, 56], f16), T([12, 48, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 48, 56, 56], f16), T([128, 48, 112, 112], f16), T([48, 1, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 48, [True, True, False]), {})
cnt: 1, ((T([128, 24, 112, 112], f16), T([128, 24, 112, 112], f16), T([24, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 24, [True, True, False]), {})
cnt: 1, ((T([128, 24, 112, 112], f16), T([128, 16, 112, 112], f16), T([24, 16, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 8, 112, 112], f16), T([128, 8, 112, 112], f16), T([8, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 8, [True, True, False]), {})
cnt: 2, ((T([128, 8, 112, 112], f16), T([128, 16, 112, 112], f16), T([8, 16, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([128, 3, 224, 224], f16), T([16, 3, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [False, True, False]), {})
Operator: aten.copy_.default
cnt: 1, ((T([128, 3, 224, 224], f16), T([128, 3, 224, 224], f16)), {})
cnt: 15, ((T([128, 160, 7, 7], f16), T([128, 160, 7, 7], f16)), {})
cnt: 6, ((T([128, 112, 14, 14], f16), T([128, 112, 14, 14], f16)), {})
cnt: 12, ((T([128, 80, 14, 14], f16), T([128, 80, 14, 14], f16)), {})
cnt: 6, ((T([128, 40, 28, 28], f16), T([128, 40, 28, 28], f16)), {})
cnt: 6, ((T([128, 24, 56, 56], f16), T([128, 24, 56, 56], f16)), {})
cnt: 3, ((T([128, 16, 112, 112], f16), T([128, 16, 112, 112], f16)), {})
Operator: aten.div.Scalar
cnt: 3, ((T([128, 960, 7, 7], f16, stride=(960, 1, 0, 0)), 49), {})
cnt: 1, ((T([128, 672, 7, 7], f16, stride=(672, 1, 0, 0)), 49), {})
cnt: 1, ((T([128, 672, 14, 14], f16, stride=(672, 1, 0, 0)), 196), {})
cnt: 1, ((T([128, 480, 14, 14], f16, stride=(480, 1, 0, 0)), 196), {})
cnt: 1, ((T([128, 120, 28, 28], f16, stride=(120, 1, 0, 0)), 784), {})
cnt: 1, ((T([128, 72, 28, 28], f16, stride=(72, 1, 0, 0)), 784), {})
Operator: aten.hardsigmoid.default
cnt: 1, ((T([128, 72, 1, 1], f16),), {})
cnt: 1, ((T([128, 120, 1, 1], f16),), {})
cnt: 1, ((T([128, 480, 1, 1], f16),), {})
cnt: 2, ((T([128, 672, 1, 1], f16),), {})
cnt: 2, ((T([128, 960, 1, 1], f16),), {})
Operator: aten.hardsigmoid_backward.default
cnt: 2, ((T([128, 960, 1, 1], f16), T([128, 960, 1, 1], f16)), {})
cnt: 2, ((T([128, 672, 1, 1], f16), T([128, 672, 1, 1], f16)), {})
cnt: 1, ((T([128, 480, 1, 1], f16), T([128, 480, 1, 1], f16)), {})
cnt: 1, ((T([128, 120, 1, 1], f16), T([128, 120, 1, 1], f16)), {})
cnt: 1, ((T([128, 72, 1, 1], f16), T([128, 72, 1, 1], f16)), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([128], i64),), {})
Operator: aten.mean.dim
cnt: 1, ((T([128, 72, 28, 28], f16), [2, 3], True), {})
cnt: 1, ((T([128, 120, 28, 28], f16), [2, 3], True), {})
cnt: 1, ((T([128, 480, 14, 14], f16), [2, 3], True), {})
cnt: 1, ((T([128, 672, 14, 14], f16), [2, 3], True), {})
cnt: 1, ((T([128, 672, 7, 7], f16), [2, 3], True), {})
cnt: 2, ((T([128, 960, 7, 7], f16), [2, 3], True), {})
cnt: 1, ((T([128, 960, 7, 7], f16), [-1, -2], True), {})
Operator: aten.mm.default
cnt: 1, ((T([128, 1000], f16), T([1000, 1280], f16)), {})
cnt: 1, ((T([1000, 128], f16, stride=(1, 1000)), T([128, 1280], f16)), {})
Operator: aten.mul.Tensor
cnt: 2, ((T([128, 72, 28, 28], f16), T([128, 72, 1, 1], f16)), {})
cnt: 2, ((T([128, 120, 28, 28], f16), T([128, 120, 1, 1], f16)), {})
cnt: 2, ((T([128, 480, 14, 14], f16), T([128, 480, 1, 1], f16)), {})
cnt: 2, ((T([128, 672, 14, 14], f16), T([128, 672, 1, 1], f16)), {})
cnt: 2, ((T([128, 672, 7, 7], f16), T([128, 672, 1, 1], f16)), {})
cnt: 4, ((T([128, 960, 7, 7], f16), T([128, 960, 1, 1], f16)), {})
cnt: 2, ((T([128, 960, 7, 7], f16), T([128, 960, 7, 7], f16)), {})
cnt: 1, ((T([128, 672, 7, 7], f16), T([128, 672, 7, 7], f16)), {})
cnt: 1, ((T([128, 672, 14, 14], f16), T([128, 672, 14, 14], f16)), {})
cnt: 1, ((T([128, 480, 14, 14], f16), T([128, 480, 14, 14], f16)), {})
cnt: 1, ((T([128, 120, 28, 28], f16), T([128, 120, 28, 28], f16)), {})
cnt: 1, ((T([128, 72, 28, 28], f16), T([128, 72, 28, 28], f16)), {})
Operator: aten.native_batch_norm.default
cnt: 1, ((T([128, 16, 112, 112], f16), T([16], f16), T([16], f16), T([16], f16), T([16], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([128, 8, 112, 112], f16), T([8], f16), T([8], f16), T([8], f16), T([8], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([128, 24, 112, 112], f16), T([24], f16), T([24], f16), T([24], f16), T([24], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 48, 56, 56], f16), T([48], f16), T([48], f16), T([48], f16), T([48], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([128, 12, 56, 56], f16), T([12], f16), T([12], f16), T([12], f16), T([12], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 16, 56, 56], f16), T([16], f16), T([16], f16), T([16], f16), T([16], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 24, 56, 56], f16), T([24], f16), T([24], f16), T([24], f16), T([24], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([128, 36, 56, 56], f16), T([36], f16), T([36], f16), T([36], f16), T([36], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 72, 28, 28], f16), T([72], f16), T([72], f16), T([72], f16), T([72], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([128, 20, 28, 28], f16), T([20], f16), T([20], f16), T([20], f16), T([20], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 24, 28, 28], f16), T([24], f16), T([24], f16), T([24], f16), T([24], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 40, 28, 28], f16), T([40], f16), T([40], f16), T([40], f16), T([40], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([128, 60, 28, 28], f16), T([60], f16), T([60], f16), T([60], f16), T([60], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([128, 120, 28, 28], f16), T([120], f16), T([120], f16), T([120], f16), T([120], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 240, 14, 14], f16), T([240], f16), T([240], f16), T([240], f16), T([240], f16), True, 0.1, 1e-05), {})
cnt: 9, ((T([128, 40, 14, 14], f16), T([40], f16), T([40], f16), T([40], f16), T([40], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([128, 80, 14, 14], f16), T([80], f16), T([80], f16), T([80], f16), T([80], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([128, 100, 14, 14], f16), T([100], f16), T([100], f16), T([100], f16), T([100], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([128, 92, 14, 14], f16), T([92], f16), T([92], f16), T([92], f16), T([92], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([128, 56, 14, 14], f16), T([56], f16), T([56], f16), T([56], f16), T([56], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 112, 14, 14], f16), T([112], f16), T([112], f16), T([112], f16), T([112], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([128, 336, 14, 14], f16), T([336], f16), T([336], f16), T([336], f16), T([336], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 672, 7, 7], f16), T([672], f16), T([672], f16), T([672], f16), T([672], f16), True, 0.1, 1e-05), {})
cnt: 10, ((T([128, 80, 7, 7], f16), T([80], f16), T([80], f16), T([80], f16), T([80], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 112, 7, 7], f16), T([112], f16), T([112], f16), T([112], f16), T([112], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 160, 7, 7], f16), T([160], f16), T([160], f16), T([160], f16), T([160], f16), True, 0.1, 1e-05), {})
cnt: 8, ((T([128, 480, 7, 7], f16), T([480], f16), T([480], f16), T([480], f16), T([480], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 960, 7, 7], f16), T([960], f16), T([960], f16), T([960], f16), T([960], f16), True, 0.1, 1e-05), {})
Operator: aten.native_batch_norm_backward.default
cnt: 1, ((T([128, 960, 7, 7], f16), T([128, 960, 7, 7], f16), T([960], f16), T([960], f16), T([960], f16), T([960], f32), T([960], f32), True, 1e-05, [True, True, True]), {})
cnt: 5, ((T([128, 80, 7, 7], f16, stride=(7840, 49, 7, 1)), T([128, 80, 7, 7], f16), T([80], f16), T([80], f16), T([80], f16), T([80], f32), T([80], f32), True, 1e-05, [True, True, True]), {})
cnt: 5, ((T([128, 80, 7, 7], f16), T([128, 80, 7, 7], f16), T([80], f16), T([80], f16), T([80], f16), T([80], f32), T([80], f32), True, 1e-05, [True, True, True]), {})
cnt: 8, ((T([128, 480, 7, 7], f16), T([128, 480, 7, 7], f16), T([480], f16), T([480], f16), T([480], f16), T([480], f32), T([480], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 160, 7, 7], f16), T([128, 160, 7, 7], f16), T([160], f16), T([160], f16), T([160], f16), T([160], f32), T([160], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 112, 7, 7], f16), T([128, 112, 7, 7], f16), T([112], f16), T([112], f16), T([112], f16), T([112], f32), T([112], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 672, 7, 7], f16), T([128, 672, 7, 7], f16), T([672], f16), T([672], f16), T([672], f16), T([672], f32), T([672], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([128, 336, 14, 14], f16), T([128, 336, 14, 14], f16), T([336], f16), T([336], f16), T([336], f16), T([336], f32), T([336], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 56, 14, 14], f16, stride=(21952, 196, 14, 1)), T([128, 56, 14, 14], f16), T([56], f16), T([56], f16), T([56], f16), T([56], f32), T([56], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 56, 14, 14], f16), T([128, 56, 14, 14], f16), T([56], f16), T([56], f16), T([56], f16), T([56], f32), T([56], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 112, 14, 14], f16), T([128, 112, 14, 14], f16), T([112], f16), T([112], f16), T([112], f16), T([112], f32), T([112], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 80, 14, 14], f16), T([128, 80, 14, 14], f16), T([80], f16), T([80], f16), T([80], f16), T([80], f32), T([80], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 240, 14, 14], f16), T([128, 240, 14, 14], f16), T([240], f16), T([240], f16), T([240], f16), T([240], f32), T([240], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([128, 40, 14, 14], f16, stride=(15680, 196, 14, 1)), T([128, 40, 14, 14], f16), T([40], f16), T([40], f16), T([40], f16), T([40], f32), T([40], f32), True, 1e-05, [True, True, True]), {})
cnt: 5, ((T([128, 40, 14, 14], f16), T([128, 40, 14, 14], f16), T([40], f16), T([40], f16), T([40], f16), T([40], f32), T([40], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([128, 92, 14, 14], f16), T([128, 92, 14, 14], f16), T([92], f16), T([92], f16), T([92], f16), T([92], f32), T([92], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 100, 14, 14], f16), T([128, 100, 14, 14], f16), T([100], f16), T([100], f16), T([100], f16), T([100], f32), T([100], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 120, 28, 28], f16), T([128, 120, 28, 28], f16), T([120], f16), T([120], f16), T([120], f16), T([120], f32), T([120], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 20, 28, 28], f16, stride=(31360, 784, 28, 1)), T([128, 20, 28, 28], f16), T([20], f16), T([20], f16), T([20], f16), T([20], f32), T([20], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 20, 28, 28], f16), T([128, 20, 28, 28], f16), T([20], f16), T([20], f16), T([20], f16), T([20], f32), T([20], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 60, 28, 28], f16), T([128, 60, 28, 28], f16), T([60], f16), T([60], f16), T([60], f16), T([60], f32), T([60], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 40, 28, 28], f16), T([128, 40, 28, 28], f16), T([40], f16), T([40], f16), T([40], f16), T([40], f32), T([40], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 24, 28, 28], f16), T([128, 24, 28, 28], f16), T([24], f16), T([24], f16), T([24], f16), T([24], f32), T([24], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 72, 28, 28], f16), T([128, 72, 28, 28], f16), T([72], f16), T([72], f16), T([72], f16), T([72], f32), T([72], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([128, 36, 56, 56], f16), T([128, 36, 56, 56], f16), T([36], f16), T([36], f16), T([36], f16), T([36], f32), T([36], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 12, 56, 56], f16, stride=(75264, 3136, 56, 1)), T([128, 12, 56, 56], f16), T([12], f16), T([12], f16), T([12], f16), T([12], f32), T([12], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 12, 56, 56], f16), T([128, 12, 56, 56], f16), T([12], f16), T([12], f16), T([12], f16), T([12], f32), T([12], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 24, 56, 56], f16), T([128, 24, 56, 56], f16), T([24], f16), T([24], f16), T([24], f16), T([24], f32), T([24], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 16, 56, 56], f16), T([128, 16, 56, 56], f16), T([16], f16), T([16], f16), T([16], f16), T([16], f32), T([16], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 48, 56, 56], f16), T([128, 48, 56, 56], f16), T([48], f16), T([48], f16), T([48], f16), T([48], f32), T([48], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 24, 112, 112], f16), T([128, 24, 112, 112], f16), T([24], f16), T([24], f16), T([24], f16), T([24], f32), T([24], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 8, 112, 112], f16, stride=(200704, 12544, 112, 1)), T([128, 8, 112, 112], f16), T([8], f16), T([8], f16), T([8], f16), T([8], f32), T([8], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 8, 112, 112], f16), T([128, 8, 112, 112], f16), T([8], f16), T([8], f16), T([8], f16), T([8], f32), T([8], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([128, 16, 112, 112], f16), T([16], f16), T([16], f16), T([16], f16), T([16], f32), T([16], f32), True, 1e-05, [True, True, True]), {})
Operator: aten.new_empty_strided.default
cnt: 5, ((T([128, 160, 7, 7], f16), [128, 160, 7, 7], [7840, 49, 7, 1]), {})
cnt: 2, ((T([128, 112, 14, 14], f16), [128, 112, 14, 14], [21952, 196, 14, 1]), {})
cnt: 4, ((T([128, 80, 14, 14], f16), [128, 80, 14, 14], [15680, 196, 14, 1]), {})
cnt: 2, ((T([128, 40, 28, 28], f16), [128, 40, 28, 28], [31360, 784, 28, 1]), {})
cnt: 2, ((T([128, 24, 56, 56], f16), [128, 24, 56, 56], [75264, 3136, 56, 1]), {})
cnt: 1, ((T([128, 16, 112, 112], f16), [128, 16, 112, 112], [200704, 12544, 112, 1]), {})
Operator: aten.new_zeros.default
cnt: 5, ((T([128, 160, 7, 7], f16), [1003520]), {})
cnt: 2, ((T([128, 112, 14, 14], f16), [2809856]), {})
cnt: 4, ((T([128, 80, 14, 14], f16), [2007040]), {})
cnt: 2, ((T([128, 40, 28, 28], f16), [4014080]), {})
cnt: 2, ((T([128, 24, 56, 56], f16), [9633792]), {})
cnt: 1, ((T([128, 16, 112, 112], f16), [25690112]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([128, 1000], f16), T([128], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([128, 1000], f16), T([128], i64), None, 1, -100), {})
Operator: aten.relu_.default
cnt: 1, ((T([128, 16, 112, 112], f16),), {})
cnt: 2, ((T([128, 8, 112, 112], f16),), {})
cnt: 2, ((T([128, 24, 112, 112], f16),), {})
cnt: 4, ((T([128, 36, 56, 56], f16),), {})
cnt: 1, ((T([128, 20, 1, 1], f16),), {})
cnt: 2, ((T([128, 60, 28, 28], f16),), {})
cnt: 1, ((T([128, 32, 1, 1], f16),), {})
cnt: 2, ((T([128, 120, 28, 28], f16),), {})
cnt: 2, ((T([128, 100, 14, 14], f16),), {})
cnt: 4, ((T([128, 92, 14, 14], f16),), {})
cnt: 2, ((T([128, 240, 14, 14], f16),), {})
cnt: 1, ((T([128, 120, 1, 1], f16),), {})
cnt: 4, ((T([128, 336, 14, 14], f16),), {})
cnt: 2, ((T([128, 168, 1, 1], f16),), {})
cnt: 8, ((T([128, 480, 7, 7], f16),), {})
cnt: 2, ((T([128, 240, 1, 1], f16),), {})
cnt: 1, ((T([128, 960, 7, 7], f16),), {})
cnt: 1, ((T([128, 1280, 1, 1], f16),), {})
Operator: aten.slice_backward.default
cnt: 4, ((T([128, 960, 7, 7], f16), [128, 960, 7, 7], 3, 0, 9223372036854775807, 1), {})
cnt: 4, ((T([128, 960, 7, 7], f16), [128, 960, 7, 7], 2, 0, 9223372036854775807, 1), {})
cnt: 4, ((T([128, 960, 7, 7], f16), [128, 960, 7, 7], 0, 0, 9223372036854775807, 1), {})
cnt: 2, ((T([128, 672, 14, 14], f16), [128, 672, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 2, ((T([128, 672, 14, 14], f16), [128, 672, 14, 14], 2, 0, 9223372036854775807, 1), {})
cnt: 2, ((T([128, 672, 14, 14], f16), [128, 672, 14, 14], 0, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([128, 480, 14, 14], f16), [128, 480, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([128, 480, 14, 14], f16), [128, 480, 14, 14], 2, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([128, 480, 14, 14], f16), [128, 480, 14, 14], 0, 0, 9223372036854775807, 1), {})
cnt: 2, ((T([128, 184, 14, 14], f16), [128, 184, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 2, ((T([128, 184, 14, 14], f16), [128, 184, 14, 14], 2, 0, 9223372036854775807, 1), {})
cnt: 2, ((T([128, 184, 14, 14], f16), [128, 184, 14, 14], 0, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([128, 200, 14, 14], f16), [128, 200, 14, 14], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([128, 200, 14, 14], f16), [128, 200, 14, 14], 2, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([128, 200, 14, 14], f16), [128, 200, 14, 14], 0, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([128, 240, 28, 28], f16), [128, 240, 28, 28], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([128, 240, 28, 28], f16), [128, 240, 28, 28], 2, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([128, 240, 28, 28], f16), [128, 240, 28, 28], 0, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([128, 120, 28, 28], f16), [128, 120, 28, 28], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([128, 120, 28, 28], f16), [128, 120, 28, 28], 2, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([128, 120, 28, 28], f16), [128, 120, 28, 28], 0, 0, 9223372036854775807, 1), {})
cnt: 2, ((T([128, 72, 56, 56], f16), [128, 72, 56, 56], 3, 0, 9223372036854775807, 1), {})
cnt: 2, ((T([128, 72, 56, 56], f16), [128, 72, 56, 56], 2, 0, 9223372036854775807, 1), {})
cnt: 2, ((T([128, 72, 56, 56], f16), [128, 72, 56, 56], 0, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([128, 48, 112, 112], f16), [128, 48, 112, 112], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([128, 48, 112, 112], f16), [128, 48, 112, 112], 2, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([128, 48, 112, 112], f16), [128, 48, 112, 112], 0, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([128, 16, 112, 112], f16), [128, 16, 112, 112], 3, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([128, 16, 112, 112], f16), [128, 16, 112, 112], 2, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([128, 16, 112, 112], f16), [128, 16, 112, 112], 0, 0, 9223372036854775807, 1), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([128, 1000], f16), [0], True), {})
cnt: 2, ((T([128, 960, 7, 7], f16), [2, 3], True), {})
cnt: 1, ((T([128, 672, 7, 7], f16), [2, 3], True), {})
cnt: 1, ((T([128, 672, 14, 14], f16), [2, 3], True), {})
cnt: 1, ((T([128, 480, 14, 14], f16), [2, 3], True), {})
cnt: 1, ((T([128, 120, 28, 28], f16), [2, 3], True), {})
cnt: 1, ((T([128, 72, 28, 28], f16), [2, 3], True), {})
Operator: aten.threshold_backward.default
cnt: 1, ((T([128, 1280, 1, 1], f16), T([128, 1280, 1, 1], f16), 0), {})
cnt: 1, ((T([128, 960, 7, 7], f16), T([128, 960, 7, 7], f16), 0), {})
cnt: 2, ((T([128, 240, 1, 1], f16), T([128, 240, 1, 1], f16), 0), {})
cnt: 4, ((T([128, 480, 7, 7], f16, stride=(47040, 49, 7, 1)), T([128, 480, 7, 7], f16), 0), {})
cnt: 4, ((T([128, 480, 7, 7], f16), T([128, 480, 7, 7], f16), 0), {})
cnt: 2, ((T([128, 168, 1, 1], f16), T([128, 168, 1, 1], f16), 0), {})
cnt: 2, ((T([128, 336, 14, 14], f16, stride=(131712, 196, 14, 1)), T([128, 336, 14, 14], f16), 0), {})
cnt: 2, ((T([128, 336, 14, 14], f16), T([128, 336, 14, 14], f16), 0), {})
cnt: 1, ((T([128, 120, 1, 1], f16), T([128, 120, 1, 1], f16), 0), {})
cnt: 1, ((T([128, 240, 14, 14], f16, stride=(94080, 196, 14, 1)), T([128, 240, 14, 14], f16), 0), {})
cnt: 1, ((T([128, 240, 14, 14], f16), T([128, 240, 14, 14], f16), 0), {})
cnt: 2, ((T([128, 92, 14, 14], f16, stride=(36064, 196, 14, 1)), T([128, 92, 14, 14], f16), 0), {})
cnt: 2, ((T([128, 92, 14, 14], f16), T([128, 92, 14, 14], f16), 0), {})
cnt: 1, ((T([128, 100, 14, 14], f16, stride=(39200, 196, 14, 1)), T([128, 100, 14, 14], f16), 0), {})
cnt: 1, ((T([128, 100, 14, 14], f16), T([128, 100, 14, 14], f16), 0), {})
cnt: 1, ((T([128, 120, 28, 28], f16, stride=(188160, 784, 28, 1)), T([128, 120, 28, 28], f16), 0), {})
cnt: 1, ((T([128, 120, 28, 28], f16), T([128, 120, 28, 28], f16), 0), {})
cnt: 1, ((T([128, 32, 1, 1], f16), T([128, 32, 1, 1], f16), 0), {})
cnt: 1, ((T([128, 60, 28, 28], f16, stride=(94080, 784, 28, 1)), T([128, 60, 28, 28], f16), 0), {})
cnt: 1, ((T([128, 60, 28, 28], f16), T([128, 60, 28, 28], f16), 0), {})
cnt: 1, ((T([128, 20, 1, 1], f16), T([128, 20, 1, 1], f16), 0), {})
cnt: 2, ((T([128, 36, 56, 56], f16, stride=(225792, 3136, 56, 1)), T([128, 36, 56, 56], f16), 0), {})
cnt: 2, ((T([128, 36, 56, 56], f16), T([128, 36, 56, 56], f16), 0), {})
cnt: 1, ((T([128, 24, 112, 112], f16, stride=(602112, 12544, 112, 1)), T([128, 24, 112, 112], f16), 0), {})
cnt: 1, ((T([128, 24, 112, 112], f16), T([128, 24, 112, 112], f16), 0), {})
cnt: 1, ((T([128, 8, 112, 112], f16, stride=(200704, 12544, 112, 1)), T([128, 8, 112, 112], f16), 0), {})
cnt: 1, ((T([128, 8, 112, 112], f16), T([128, 8, 112, 112], f16), 0), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([128, 16, 112, 112], f16), 0), {})

View File

@ -0,0 +1,239 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([128, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([128, 1000], f16), T([128, 1000], f16), 1, f16), {})
Operator: aten.add.Tensor
cnt: 4, ((T([128, 384, 8, 8], f16), T([128, 384, 8, 8], f16)), {})
cnt: 3, ((T([128, 2048, 8, 8], f16), T([128, 2048, 8, 8], f16)), {})
cnt: 3, ((T([128, 1280, 8, 8], f16), T([128, 1280, 8, 8], f16)), {})
cnt: 14, ((T([128, 768, 17, 17], f16), T([128, 768, 17, 17], f16)), {})
cnt: 5, ((T([128, 288, 35, 35], f16), T([128, 288, 35, 35], f16)), {})
cnt: 3, ((T([128, 256, 35, 35], f16), T([128, 256, 35, 35], f16)), {})
cnt: 3, ((T([128, 192, 35, 35], f16), T([128, 192, 35, 35], f16)), {})
Operator: aten.add_.Tensor
cnt: 94, ((T([], i64), 1), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([128, 2048], f16), T([2048, 1000], f16, stride=(1, 2048))), {})
Operator: aten.avg_pool2d.default
cnt: 1, ((T([128, 192, 35, 35], f16), [3, 3], [1, 1], [1, 1]), {})
cnt: 1, ((T([128, 256, 35, 35], f16), [3, 3], [1, 1], [1, 1]), {})
cnt: 1, ((T([128, 288, 35, 35], f16), [3, 3], [1, 1], [1, 1]), {})
cnt: 4, ((T([128, 768, 17, 17], f16), [3, 3], [1, 1], [1, 1]), {})
cnt: 1, ((T([128, 1280, 8, 8], f16), [3, 3], [1, 1], [1, 1]), {})
cnt: 1, ((T([128, 2048, 8, 8], f16), [3, 3], [1, 1], [1, 1]), {})
Operator: aten.avg_pool2d_backward.default
cnt: 1, ((T([128, 2048, 8, 8], f16), T([128, 2048, 8, 8], f16), [3, 3], [1, 1], [1, 1], False, True, None), {})
cnt: 1, ((T([128, 1280, 8, 8], f16), T([128, 1280, 8, 8], f16), [3, 3], [1, 1], [1, 1], False, True, None), {})
cnt: 4, ((T([128, 768, 17, 17], f16), T([128, 768, 17, 17], f16), [3, 3], [1, 1], [1, 1], False, True, None), {})
cnt: 1, ((T([128, 288, 35, 35], f16), T([128, 288, 35, 35], f16), [3, 3], [1, 1], [1, 1], False, True, None), {})
cnt: 1, ((T([128, 256, 35, 35], f16), T([128, 256, 35, 35], f16), [3, 3], [1, 1], [1, 1], False, True, None), {})
cnt: 1, ((T([128, 192, 35, 35], f16), T([128, 192, 35, 35], f16), [3, 3], [1, 1], [1, 1], False, True, None), {})
Operator: aten.cat.default
cnt: 1, (([T([128, 64, 35, 35], f16), T([128, 64, 35, 35], f16), T([128, 96, 35, 35], f16), T([128, 32, 35, 35], f16)], 1), {})
cnt: 2, (([T([128, 64, 35, 35], f16), T([128, 64, 35, 35], f16), T([128, 96, 35, 35], f16), T([128, 64, 35, 35], f16)], 1), {})
cnt: 1, (([T([128, 384, 17, 17], f16), T([128, 96, 17, 17], f16), T([128, 288, 17, 17], f16)], 1), {})
cnt: 4, (([T([128, 192, 17, 17], f16), T([128, 192, 17, 17], f16), T([128, 192, 17, 17], f16), T([128, 192, 17, 17], f16)], 1), {})
cnt: 1, (([T([128, 320, 8, 8], f16), T([128, 192, 8, 8], f16), T([128, 768, 8, 8], f16)], 1), {})
cnt: 4, (([T([128, 384, 8, 8], f16), T([128, 384, 8, 8], f16)], 1), {})
cnt: 2, (([T([128, 320, 8, 8], f16), T([128, 768, 8, 8], f16), T([128, 768, 8, 8], f16), T([128, 192, 8, 8], f16)], 1), {})
Operator: aten.clone.default
cnt: 1, ((T([128, 3, 299, 299], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([128, 3, 299, 299], f16), T([32, 3, 3, 3], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 32, 149, 149], f16), T([32, 32, 3, 3], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 32, 147, 147], f16), T([64, 32, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 64, 73, 73], f16), T([80, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 80, 73, 73], f16), T([192, 80, 3, 3], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 192, 35, 35], f16), T([64, 192, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 192, 35, 35], f16), T([48, 192, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 48, 35, 35], f16), T([64, 48, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 64, 35, 35], f16), T([96, 64, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 96, 35, 35], f16), T([96, 96, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 192, 35, 35], f16), T([32, 192, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 256, 35, 35], f16), T([64, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 35, 35], f16), T([48, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 288, 35, 35], f16), T([64, 288, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 288, 35, 35], f16), T([48, 288, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 288, 35, 35], f16), T([384, 288, 3, 3], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 96, 35, 35], f16), T([96, 96, 3, 3], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 12, ((T([128, 768, 17, 17], f16), T([192, 768, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 768, 17, 17], f16), T([128, 768, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 128, 17, 17], f16), T([128, 128, 1, 7], f16), None, [1, 1], [0, 3], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 128, 17, 17], f16), T([192, 128, 7, 1], f16), None, [1, 1], [3, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 128, 17, 17], f16), T([128, 128, 7, 1], f16), None, [1, 1], [3, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 128, 17, 17], f16), T([192, 128, 1, 7], f16), None, [1, 1], [0, 3], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 768, 17, 17], f16), T([160, 768, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 160, 17, 17], f16), T([160, 160, 1, 7], f16), None, [1, 1], [0, 3], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 160, 17, 17], f16), T([192, 160, 7, 1], f16), None, [1, 1], [3, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 160, 17, 17], f16), T([160, 160, 7, 1], f16), None, [1, 1], [3, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 160, 17, 17], f16), T([192, 160, 1, 7], f16), None, [1, 1], [0, 3], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 192, 17, 17], f16), T([192, 192, 1, 7], f16), None, [1, 1], [0, 3], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 192, 17, 17], f16), T([192, 192, 7, 1], f16), None, [1, 1], [3, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 192, 17, 17], f16), T([320, 192, 3, 3], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 192, 17, 17], f16), T([192, 192, 3, 3], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1280, 8, 8], f16), T([320, 1280, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1280, 8, 8], f16), T([384, 1280, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 384, 8, 8], f16), T([384, 384, 1, 3], f16), None, [1, 1], [0, 1], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 384, 8, 8], f16), T([384, 384, 3, 1], f16), None, [1, 1], [1, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1280, 8, 8], f16), T([448, 1280, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 448, 8, 8], f16), T([384, 448, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1280, 8, 8], f16), T([192, 1280, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 2048, 8, 8], f16), T([320, 2048, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 2048, 8, 8], f16), T([384, 2048, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 2048, 8, 8], f16), T([448, 2048, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 2048, 8, 8], f16), T([192, 2048, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([128, 192, 8, 8], f16), T([128, 2048, 8, 8], f16), T([192, 2048, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 384, 8, 8], f16), T([128, 384, 8, 8], f16), T([384, 384, 3, 1], f16), [0], [1, 1], [1, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 384, 8, 8], f16), T([128, 384, 8, 8], f16), T([384, 384, 1, 3], f16), [0], [1, 1], [0, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 384, 8, 8], f16), T([128, 448, 8, 8], f16), T([384, 448, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 448, 8, 8], f16), T([128, 2048, 8, 8], f16), T([448, 2048, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 384, 8, 8], f16), T([128, 2048, 8, 8], f16), T([384, 2048, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 320, 8, 8], f16), T([128, 2048, 8, 8], f16), T([320, 2048, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 192, 8, 8], f16), T([128, 1280, 8, 8], f16), T([192, 1280, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 448, 8, 8], f16), T([128, 1280, 8, 8], f16), T([448, 1280, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 384, 8, 8], f16), T([128, 1280, 8, 8], f16), T([384, 1280, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 320, 8, 8], f16), T([128, 1280, 8, 8], f16), T([320, 1280, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 192, 8, 8], f16), T([128, 192, 17, 17], f16), T([192, 192, 3, 3], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 192, 17, 17], f16), T([128, 192, 17, 17], f16), T([192, 192, 7, 1], f16), [0], [1, 1], [3, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 192, 17, 17], f16), T([128, 192, 17, 17], f16), T([192, 192, 1, 7], f16), [0], [1, 1], [0, 3], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 12, ((T([128, 192, 17, 17], f16), T([128, 768, 17, 17], f16), T([192, 768, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 320, 8, 8], f16), T([128, 192, 17, 17], f16), T([320, 192, 3, 3], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 192, 17, 17], f16), T([128, 160, 17, 17], f16), T([192, 160, 1, 7], f16), [0], [1, 1], [0, 3], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 160, 17, 17], f16), T([128, 160, 17, 17], f16), T([160, 160, 7, 1], f16), [0], [1, 1], [3, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 160, 17, 17], f16), T([128, 160, 17, 17], f16), T([160, 160, 1, 7], f16), [0], [1, 1], [0, 3], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 160, 17, 17], f16), T([128, 768, 17, 17], f16), T([160, 768, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 192, 17, 17], f16), T([128, 160, 17, 17], f16), T([192, 160, 7, 1], f16), [0], [1, 1], [3, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 192, 17, 17], f16), T([128, 128, 17, 17], f16), T([192, 128, 1, 7], f16), [0], [1, 1], [0, 3], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 128, 17, 17], f16), T([128, 128, 17, 17], f16), T([128, 128, 7, 1], f16), [0], [1, 1], [3, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 128, 17, 17], f16), T([128, 128, 17, 17], f16), T([128, 128, 1, 7], f16), [0], [1, 1], [0, 3], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 128, 17, 17], f16), T([128, 768, 17, 17], f16), T([128, 768, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 192, 17, 17], f16), T([128, 128, 17, 17], f16), T([192, 128, 7, 1], f16), [0], [1, 1], [3, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 96, 17, 17], f16), T([128, 96, 35, 35], f16), T([96, 96, 3, 3], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 96, 35, 35], f16), T([128, 64, 35, 35], f16), T([96, 64, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 64, 35, 35], f16), T([128, 288, 35, 35], f16), T([64, 288, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 384, 17, 17], f16), T([128, 288, 35, 35], f16), T([384, 288, 3, 3], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 96, 35, 35], f16), T([128, 96, 35, 35], f16), T([96, 96, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 64, 35, 35], f16), T([128, 48, 35, 35], f16), T([64, 48, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 48, 35, 35], f16), T([128, 288, 35, 35], f16), T([48, 288, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 64, 35, 35], f16), T([128, 256, 35, 35], f16), T([64, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 48, 35, 35], f16), T([128, 256, 35, 35], f16), T([48, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 32, 35, 35], f16), T([128, 192, 35, 35], f16), T([32, 192, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 64, 35, 35], f16), T([128, 192, 35, 35], f16), T([64, 192, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 48, 35, 35], f16), T([128, 192, 35, 35], f16), T([48, 192, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 192, 71, 71], f16), T([128, 80, 73, 73], f16), T([192, 80, 3, 3], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 80, 73, 73], f16), T([128, 64, 73, 73], f16), T([80, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 64, 147, 147], f16), T([128, 32, 147, 147], f16), T([64, 32, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 32, 147, 147], f16), T([128, 32, 149, 149], f16), T([32, 32, 3, 3], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 32, 149, 149], f16), T([128, 3, 299, 299], f16), T([32, 3, 3, 3], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [False, True, False]), {})
Operator: aten.copy_.default
cnt: 1, ((T([128, 3, 299, 299], f16), T([128, 3, 299, 299], f16)), {})
Operator: aten.div.Scalar
cnt: 1, ((T([128, 2048, 8, 8], f16, stride=(2048, 1, 0, 0)), 64), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([128], i64),), {})
Operator: aten.max_pool2d_with_indices.default
cnt: 1, ((T([128, 64, 147, 147], f16), [3, 3], [2, 2]), {})
cnt: 1, ((T([128, 192, 71, 71], f16), [3, 3], [2, 2]), {})
cnt: 1, ((T([128, 288, 35, 35], f16), [3, 3], [2, 2]), {})
cnt: 1, ((T([128, 768, 17, 17], f16), [3, 3], [2, 2]), {})
Operator: aten.max_pool2d_with_indices_backward.default
cnt: 1, ((T([128, 768, 8, 8], f16, stride=(81920, 64, 8, 1)), T([128, 768, 17, 17], f16), [3, 3], [2, 2], [0, 0], [1, 1], False, T([128, 768, 8, 8], i64)), {})
cnt: 1, ((T([128, 288, 17, 17], f16, stride=(221952, 289, 17, 1)), T([128, 288, 35, 35], f16), [3, 3], [2, 2], [0, 0], [1, 1], False, T([128, 288, 17, 17], i64)), {})
cnt: 1, ((T([128, 192, 35, 35], f16), T([128, 192, 71, 71], f16), [3, 3], [2, 2], [0, 0], [1, 1], False, T([128, 192, 35, 35], i64)), {})
cnt: 1, ((T([128, 64, 73, 73], f16), T([128, 64, 147, 147], f16), [3, 3], [2, 2], [0, 0], [1, 1], False, T([128, 64, 73, 73], i64)), {})
Operator: aten.mean.dim
cnt: 1, ((T([128, 2048, 8, 8], f16), [-1, -2], True), {})
Operator: aten.mm.default
cnt: 1, ((T([128, 1000], f16), T([1000, 2048], f16)), {})
cnt: 1, ((T([1000, 128], f16, stride=(1, 1000)), T([128, 2048], f16)), {})
Operator: aten.native_batch_norm.default
cnt: 1, ((T([128, 32, 149, 149], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([128, 32, 147, 147], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([128, 64, 147, 147], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([128, 80, 73, 73], f16), T([80], f16), T([80], f16), T([80], f16), T([80], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([128, 192, 71, 71], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f16), True, 0.1, 0.001), {})
cnt: 12, ((T([128, 64, 35, 35], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 0.001), {})
cnt: 3, ((T([128, 48, 35, 35], f16), T([48], f16), T([48], f16), T([48], f16), T([48], f16), True, 0.1, 0.001), {})
cnt: 7, ((T([128, 96, 35, 35], f16), T([96], f16), T([96], f16), T([96], f16), T([96], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([128, 32, 35, 35], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([128, 384, 17, 17], f16), T([384], f16), T([384], f16), T([384], f16), T([384], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([128, 96, 17, 17], f16), T([96], f16), T([96], f16), T([96], f16), T([96], f16), True, 0.1, 0.001), {})
cnt: 26, ((T([128, 192, 17, 17], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f16), True, 0.1, 0.001), {})
cnt: 6, ((T([128, 128, 17, 17], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 0.001), {})
cnt: 12, ((T([128, 160, 17, 17], f16), T([160], f16), T([160], f16), T([160], f16), T([160], f16), True, 0.1, 0.001), {})
cnt: 3, ((T([128, 320, 8, 8], f16), T([320], f16), T([320], f16), T([320], f16), T([320], f16), True, 0.1, 0.001), {})
cnt: 3, ((T([128, 192, 8, 8], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f16), True, 0.1, 0.001), {})
cnt: 12, ((T([128, 384, 8, 8], f16), T([384], f16), T([384], f16), T([384], f16), T([384], f16), True, 0.1, 0.001), {})
cnt: 2, ((T([128, 448, 8, 8], f16), T([448], f16), T([448], f16), T([448], f16), T([448], f16), True, 0.1, 0.001), {})
Operator: aten.native_batch_norm_backward.default
cnt: 3, ((T([128, 192, 8, 8], f16), T([128, 192, 8, 8], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f32), T([192], f32), True, 0.001, [True, True, True]), {})
cnt: 12, ((T([128, 384, 8, 8], f16), T([128, 384, 8, 8], f16), T([384], f16), T([384], f16), T([384], f16), T([384], f32), T([384], f32), True, 0.001, [True, True, True]), {})
cnt: 2, ((T([128, 448, 8, 8], f16), T([128, 448, 8, 8], f16), T([448], f16), T([448], f16), T([448], f16), T([448], f32), T([448], f32), True, 0.001, [True, True, True]), {})
cnt: 3, ((T([128, 320, 8, 8], f16), T([128, 320, 8, 8], f16), T([320], f16), T([320], f16), T([320], f16), T([320], f32), T([320], f32), True, 0.001, [True, True, True]), {})
cnt: 26, ((T([128, 192, 17, 17], f16), T([128, 192, 17, 17], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f32), T([192], f32), True, 0.001, [True, True, True]), {})
cnt: 12, ((T([128, 160, 17, 17], f16), T([128, 160, 17, 17], f16), T([160], f16), T([160], f16), T([160], f16), T([160], f32), T([160], f32), True, 0.001, [True, True, True]), {})
cnt: 6, ((T([128, 128, 17, 17], f16), T([128, 128, 17, 17], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([128, 96, 17, 17], f16), T([128, 96, 17, 17], f16), T([96], f16), T([96], f16), T([96], f16), T([96], f32), T([96], f32), True, 0.001, [True, True, True]), {})
cnt: 7, ((T([128, 96, 35, 35], f16), T([128, 96, 35, 35], f16), T([96], f16), T([96], f16), T([96], f16), T([96], f32), T([96], f32), True, 0.001, [True, True, True]), {})
cnt: 12, ((T([128, 64, 35, 35], f16), T([128, 64, 35, 35], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([128, 384, 17, 17], f16), T([128, 384, 17, 17], f16), T([384], f16), T([384], f16), T([384], f16), T([384], f32), T([384], f32), True, 0.001, [True, True, True]), {})
cnt: 3, ((T([128, 48, 35, 35], f16), T([128, 48, 35, 35], f16), T([48], f16), T([48], f16), T([48], f16), T([48], f32), T([48], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([128, 32, 35, 35], f16), T([128, 32, 35, 35], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([128, 192, 71, 71], f16), T([128, 192, 71, 71], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f32), T([192], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([128, 80, 73, 73], f16), T([128, 80, 73, 73], f16), T([80], f16), T([80], f16), T([80], f16), T([80], f32), T([80], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([128, 64, 147, 147], f16), T([128, 64, 147, 147], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([128, 32, 147, 147], f16), T([128, 32, 147, 147], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([128, 32, 149, 149], f16), T([128, 32, 149, 149], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 0.001, [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([128, 1000], f16), T([128], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([128, 1000], f16), T([128], i64), None, 1, -100), {})
Operator: aten.relu_.default
cnt: 1, ((T([128, 32, 149, 149], f16),), {})
cnt: 1, ((T([128, 32, 147, 147], f16),), {})
cnt: 1, ((T([128, 64, 147, 147], f16),), {})
cnt: 1, ((T([128, 80, 73, 73], f16),), {})
cnt: 1, ((T([128, 192, 71, 71], f16),), {})
cnt: 12, ((T([128, 64, 35, 35], f16),), {})
cnt: 3, ((T([128, 48, 35, 35], f16),), {})
cnt: 7, ((T([128, 96, 35, 35], f16),), {})
cnt: 1, ((T([128, 32, 35, 35], f16),), {})
cnt: 1, ((T([128, 384, 17, 17], f16),), {})
cnt: 1, ((T([128, 96, 17, 17], f16),), {})
cnt: 26, ((T([128, 192, 17, 17], f16),), {})
cnt: 6, ((T([128, 128, 17, 17], f16),), {})
cnt: 12, ((T([128, 160, 17, 17], f16),), {})
cnt: 3, ((T([128, 320, 8, 8], f16),), {})
cnt: 3, ((T([128, 192, 8, 8], f16),), {})
cnt: 12, ((T([128, 384, 8, 8], f16),), {})
cnt: 2, ((T([128, 448, 8, 8], f16),), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([128, 1000], f16), [0], True), {})
Operator: aten.threshold_backward.default
cnt: 2, ((T([128, 192, 8, 8], f16, stride=(131072, 64, 8, 1)), T([128, 192, 8, 8], f16), 0), {})
cnt: 8, ((T([128, 384, 8, 8], f16, stride=(131072, 64, 8, 1)), T([128, 384, 8, 8], f16), 0), {})
cnt: 4, ((T([128, 384, 8, 8], f16), T([128, 384, 8, 8], f16), 0), {})
cnt: 2, ((T([128, 448, 8, 8], f16), T([128, 448, 8, 8], f16), 0), {})
cnt: 2, ((T([128, 320, 8, 8], f16, stride=(131072, 64, 8, 1)), T([128, 320, 8, 8], f16), 0), {})
cnt: 1, ((T([128, 192, 8, 8], f16, stride=(81920, 64, 8, 1)), T([128, 192, 8, 8], f16), 0), {})
cnt: 10, ((T([128, 192, 17, 17], f16), T([128, 192, 17, 17], f16), 0), {})
cnt: 1, ((T([128, 320, 8, 8], f16, stride=(81920, 64, 8, 1)), T([128, 320, 8, 8], f16), 0), {})
cnt: 16, ((T([128, 192, 17, 17], f16, stride=(221952, 289, 17, 1)), T([128, 192, 17, 17], f16), 0), {})
cnt: 12, ((T([128, 160, 17, 17], f16), T([128, 160, 17, 17], f16), 0), {})
cnt: 6, ((T([128, 128, 17, 17], f16), T([128, 128, 17, 17], f16), 0), {})
cnt: 1, ((T([128, 96, 17, 17], f16, stride=(221952, 289, 17, 1)), T([128, 96, 17, 17], f16), 0), {})
cnt: 4, ((T([128, 96, 35, 35], f16), T([128, 96, 35, 35], f16), 0), {})
cnt: 4, ((T([128, 64, 35, 35], f16), T([128, 64, 35, 35], f16), 0), {})
cnt: 1, ((T([128, 384, 17, 17], f16, stride=(221952, 289, 17, 1)), T([128, 384, 17, 17], f16), 0), {})
cnt: 6, ((T([128, 64, 35, 35], f16, stride=(352800, 1225, 35, 1)), T([128, 64, 35, 35], f16), 0), {})
cnt: 2, ((T([128, 96, 35, 35], f16, stride=(352800, 1225, 35, 1)), T([128, 96, 35, 35], f16), 0), {})
cnt: 3, ((T([128, 48, 35, 35], f16), T([128, 48, 35, 35], f16), 0), {})
cnt: 1, ((T([128, 32, 35, 35], f16, stride=(313600, 1225, 35, 1)), T([128, 32, 35, 35], f16), 0), {})
cnt: 1, ((T([128, 96, 35, 35], f16, stride=(313600, 1225, 35, 1)), T([128, 96, 35, 35], f16), 0), {})
cnt: 2, ((T([128, 64, 35, 35], f16, stride=(313600, 1225, 35, 1)), T([128, 64, 35, 35], f16), 0), {})
cnt: 1, ((T([128, 192, 71, 71], f16), T([128, 192, 71, 71], f16), 0), {})
cnt: 1, ((T([128, 80, 73, 73], f16), T([128, 80, 73, 73], f16), 0), {})
cnt: 1, ((T([128, 64, 147, 147], f16), T([128, 64, 147, 147], f16), 0), {})
cnt: 1, ((T([128, 32, 147, 147], f16), T([128, 32, 147, 147], f16), 0), {})
cnt: 1, ((T([128, 32, 149, 149], f16), T([128, 32, 149, 149], f16), 0), {})

View File

@ -0,0 +1,187 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([32, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([32, 1000], f16), T([32, 1000], f16), 1, f16), {})
Operator: aten.add.Tensor
cnt: 5, ((T([32, 2048, 7, 7], f16), T([32, 2048, 7, 7], f16)), {})
cnt: 72, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16)), {})
cnt: 16, ((T([32, 512, 28, 28], f16), T([32, 512, 28, 28], f16)), {})
cnt: 6, ((T([32, 256, 56, 56], f16), T([32, 256, 56, 56], f16)), {})
cnt: 1, ((T([32, 128, 56, 56], f16), T([32, 128, 56, 56], f16)), {})
Operator: aten.add_.Tensor
cnt: 157, ((T([], i64), 1), {})
cnt: 3, ((T([32, 256, 56, 56], f16), T([32, 256, 56, 56], f16)), {})
cnt: 8, ((T([32, 512, 28, 28], f16), T([32, 512, 28, 28], f16)), {})
cnt: 36, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16)), {})
cnt: 3, ((T([32, 2048, 7, 7], f16), T([32, 2048, 7, 7], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([32, 2048], f16), T([2048, 1000], f16, stride=(1, 2048))), {})
Operator: aten.clone.default
cnt: 1, ((T([32, 3, 224, 224], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([32, 3, 224, 224], f16), T([64, 3, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 64, 112, 112], f16), T([64, 64, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 64, 112, 112], f16), T([128, 64, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 128, 56, 56], f16), T([128, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([32, 128, 56, 56], f16), T([256, 2, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 64), {})
cnt: 4, ((T([32, 256, 56, 56], f16), T([256, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([32, 256, 1, 1], f16), T([16, 256, 1, 1], f16), T([16], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([32, 16, 1, 1], f16), T([256, 16, 1, 1], f16), T([256], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 128, 56, 56], f16), T([256, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([32, 256, 56, 56], f16), T([128, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 256, 56, 56], f16), T([512, 4, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 64), {})
cnt: 9, ((T([32, 512, 28, 28], f16), T([512, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 8, ((T([32, 512, 1, 1], f16), T([32, 512, 1, 1], f16), T([32], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 8, ((T([32, 32, 1, 1], f16), T([512, 32, 1, 1], f16), T([512], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 256, 56, 56], f16), T([512, 256, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 7, ((T([32, 512, 28, 28], f16), T([256, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 7, ((T([32, 256, 28, 28], f16), T([512, 4, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 64), {})
cnt: 1, ((T([32, 512, 28, 28], f16), T([1024, 8, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 64), {})
cnt: 37, ((T([32, 1024, 14, 14], f16), T([1024, 1024, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 36, ((T([32, 1024, 1, 1], f16), T([64, 1024, 1, 1], f16), T([64], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 36, ((T([32, 64, 1, 1], f16), T([1024, 64, 1, 1], f16), T([1024], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 512, 28, 28], f16), T([1024, 512, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 35, ((T([32, 1024, 14, 14], f16), T([512, 1024, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 35, ((T([32, 512, 14, 14], f16), T([1024, 8, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 64), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([2048, 16, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 64), {})
cnt: 3, ((T([32, 2048, 7, 7], f16), T([2048, 2048, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([32, 2048, 1, 1], f16), T([128, 2048, 1, 1], f16), T([128], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([32, 128, 1, 1], f16), T([2048, 128, 1, 1], f16), T([2048], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([2048, 1024, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([32, 2048, 7, 7], f16), T([1024, 2048, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([32, 1024, 7, 7], f16), T([2048, 16, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 64), {})
Operator: aten.convolution_backward.default
cnt: 3, ((T([32, 2048, 1, 1], f16), T([32, 128, 1, 1], f16), T([2048, 128, 1, 1], f16), [2048], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 3, ((T([32, 128, 1, 1], f16), T([32, 2048, 1, 1], f16), T([128, 2048, 1, 1], f16), [128], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 3, ((T([32, 2048, 7, 7], f16), T([32, 2048, 7, 7], f16), T([2048, 2048, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([32, 2048, 7, 7], f16), T([32, 1024, 7, 7], f16), T([2048, 16, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 64, [True, True, False]), {})
cnt: 2, ((T([32, 1024, 7, 7], f16), T([32, 2048, 7, 7], f16), T([1024, 2048, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 2048, 7, 7], f16), T([32, 1024, 14, 14], f16), T([2048, 1024, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 2048, 7, 7], f16), T([32, 1024, 14, 14], f16), T([2048, 16, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 64, [True, True, False]), {})
cnt: 37, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16), T([1024, 1024, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 36, ((T([32, 1024, 1, 1], f16), T([32, 64, 1, 1], f16), T([1024, 64, 1, 1], f16), [1024], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 36, ((T([32, 64, 1, 1], f16), T([32, 1024, 1, 1], f16), T([64, 1024, 1, 1], f16), [64], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 35, ((T([32, 1024, 14, 14], f16), T([32, 512, 14, 14], f16), T([1024, 8, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 64, [True, True, False]), {})
cnt: 35, ((T([32, 512, 14, 14], f16), T([32, 1024, 14, 14], f16), T([512, 1024, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([32, 512, 28, 28], f16), T([1024, 512, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([32, 512, 28, 28], f16), T([1024, 8, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 64, [True, True, False]), {})
cnt: 9, ((T([32, 512, 28, 28], f16), T([32, 512, 28, 28], f16), T([512, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 8, ((T([32, 512, 1, 1], f16), T([32, 32, 1, 1], f16), T([512, 32, 1, 1], f16), [512], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 8, ((T([32, 32, 1, 1], f16), T([32, 512, 1, 1], f16), T([32, 512, 1, 1], f16), [32], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 7, ((T([32, 512, 28, 28], f16), T([32, 256, 28, 28], f16), T([512, 4, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 64, [True, True, False]), {})
cnt: 7, ((T([32, 256, 28, 28], f16), T([32, 512, 28, 28], f16), T([256, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 512, 28, 28], f16), T([32, 256, 56, 56], f16), T([512, 256, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 512, 28, 28], f16), T([32, 256, 56, 56], f16), T([512, 4, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 64, [True, True, False]), {})
cnt: 4, ((T([32, 256, 56, 56], f16), T([32, 256, 56, 56], f16), T([256, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([32, 256, 1, 1], f16), T([32, 16, 1, 1], f16), T([256, 16, 1, 1], f16), [256], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 3, ((T([32, 16, 1, 1], f16), T([32, 256, 1, 1], f16), T([16, 256, 1, 1], f16), [16], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 3, ((T([32, 256, 56, 56], f16), T([32, 128, 56, 56], f16), T([256, 2, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 64, [True, True, False]), {})
cnt: 2, ((T([32, 128, 56, 56], f16), T([32, 256, 56, 56], f16), T([128, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 256, 56, 56], f16), T([32, 128, 56, 56], f16), T([256, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 128, 56, 56], f16), T([32, 128, 56, 56], f16), T([128, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 128, 112, 112], f16), T([32, 64, 112, 112], f16), T([128, 64, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 64, 112, 112], f16), T([32, 64, 112, 112], f16), T([64, 64, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 64, 112, 112], f16), T([32, 3, 224, 224], f16), T([64, 3, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [False, True, False]), {})
Operator: aten.copy_.default
cnt: 1, ((T([32, 3, 224, 224], f16), T([32, 3, 224, 224], f16)), {})
Operator: aten.div.Scalar
cnt: 4, ((T([32, 2048, 7, 7], f16, stride=(2048, 1, 0, 0)), 49), {})
cnt: 36, ((T([32, 1024, 14, 14], f16, stride=(1024, 1, 0, 0)), 196), {})
cnt: 8, ((T([32, 512, 28, 28], f16, stride=(512, 1, 0, 0)), 784), {})
cnt: 3, ((T([32, 256, 56, 56], f16, stride=(256, 1, 0, 0)), 3136), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([32], i64),), {})
Operator: aten.max_pool2d_with_indices.default
cnt: 1, ((T([32, 128, 112, 112], f16), [3, 3], [2, 2], [1, 1]), {})
Operator: aten.max_pool2d_with_indices_backward.default
cnt: 1, ((T([32, 128, 56, 56], f16), T([32, 128, 112, 112], f16), [3, 3], [2, 2], [1, 1], [1, 1], False, T([32, 128, 56, 56], i64)), {})
Operator: aten.mean.dim
cnt: 3, ((T([32, 256, 56, 56], f16), [2, 3], True), {})
cnt: 8, ((T([32, 512, 28, 28], f16), [2, 3], True), {})
cnt: 36, ((T([32, 1024, 14, 14], f16), [2, 3], True), {})
cnt: 3, ((T([32, 2048, 7, 7], f16), [2, 3], True), {})
cnt: 1, ((T([32, 2048, 7, 7], f16), [-1, -2], True), {})
Operator: aten.mm.default
cnt: 1, ((T([32, 1000], f16), T([1000, 2048], f16)), {})
cnt: 1, ((T([1000, 32], f16, stride=(1, 1000)), T([32, 2048], f16)), {})
Operator: aten.mul.Tensor
cnt: 6, ((T([32, 256, 56, 56], f16), T([32, 256, 1, 1], f16)), {})
cnt: 16, ((T([32, 512, 28, 28], f16), T([32, 512, 1, 1], f16)), {})
cnt: 72, ((T([32, 1024, 14, 14], f16), T([32, 1024, 1, 1], f16)), {})
cnt: 6, ((T([32, 2048, 7, 7], f16), T([32, 2048, 1, 1], f16)), {})
cnt: 3, ((T([32, 2048, 7, 7], f16), T([32, 2048, 7, 7], f16)), {})
cnt: 36, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16)), {})
cnt: 8, ((T([32, 512, 28, 28], f16), T([32, 512, 28, 28], f16)), {})
cnt: 3, ((T([32, 256, 56, 56], f16), T([32, 256, 56, 56], f16)), {})
Operator: aten.native_batch_norm.default
cnt: 2, ((T([32, 64, 112, 112], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([32, 128, 112, 112], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([32, 128, 56, 56], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 8, ((T([32, 256, 56, 56], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 18, ((T([32, 512, 28, 28], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
cnt: 7, ((T([32, 256, 28, 28], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 74, ((T([32, 1024, 14, 14], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f16), True, 0.1, 1e-05), {})
cnt: 35, ((T([32, 512, 14, 14], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
cnt: 7, ((T([32, 2048, 7, 7], f16), T([2048], f16), T([2048], f16), T([2048], f16), T([2048], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([32, 1024, 7, 7], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f16), True, 0.1, 1e-05), {})
Operator: aten.native_batch_norm_backward.default
cnt: 7, ((T([32, 2048, 7, 7], f16), T([32, 2048, 7, 7], f16), T([2048], f16), T([2048], f16), T([2048], f16), T([2048], f32), T([2048], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([32, 1024, 7, 7], f16), T([32, 1024, 7, 7], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f32), T([1024], f32), True, 1e-05, [True, True, True]), {})
cnt: 74, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f32), T([1024], f32), True, 1e-05, [True, True, True]), {})
cnt: 35, ((T([32, 512, 14, 14], f16), T([32, 512, 14, 14], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 18, ((T([32, 512, 28, 28], f16), T([32, 512, 28, 28], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 7, ((T([32, 256, 28, 28], f16), T([32, 256, 28, 28], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 8, ((T([32, 256, 56, 56], f16), T([32, 256, 56, 56], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([32, 128, 56, 56], f16), T([32, 128, 56, 56], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([32, 128, 112, 112], f16), T([32, 128, 112, 112], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([32, 64, 112, 112], f16), T([32, 64, 112, 112], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([32, 1000], f16), T([32], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([32, 1000], f16), T([32], i64), None, 1, -100), {})
Operator: aten.relu_.default
cnt: 2, ((T([32, 64, 112, 112], f16),), {})
cnt: 1, ((T([32, 128, 112, 112], f16),), {})
cnt: 3, ((T([32, 128, 56, 56], f16),), {})
cnt: 7, ((T([32, 256, 56, 56], f16),), {})
cnt: 3, ((T([32, 16, 1, 1], f16),), {})
cnt: 17, ((T([32, 512, 28, 28], f16),), {})
cnt: 8, ((T([32, 32, 1, 1], f16),), {})
cnt: 7, ((T([32, 256, 28, 28], f16),), {})
cnt: 73, ((T([32, 1024, 14, 14], f16),), {})
cnt: 36, ((T([32, 64, 1, 1], f16),), {})
cnt: 35, ((T([32, 512, 14, 14], f16),), {})
cnt: 6, ((T([32, 2048, 7, 7], f16),), {})
cnt: 3, ((T([32, 128, 1, 1], f16),), {})
cnt: 2, ((T([32, 1024, 7, 7], f16),), {})
Operator: aten.sigmoid.default
cnt: 3, ((T([32, 256, 1, 1], f16),), {})
cnt: 8, ((T([32, 512, 1, 1], f16),), {})
cnt: 36, ((T([32, 1024, 1, 1], f16),), {})
cnt: 3, ((T([32, 2048, 1, 1], f16),), {})
Operator: aten.sigmoid_backward.default
cnt: 3, ((T([32, 2048, 1, 1], f16), T([32, 2048, 1, 1], f16)), {})
cnt: 36, ((T([32, 1024, 1, 1], f16), T([32, 1024, 1, 1], f16)), {})
cnt: 8, ((T([32, 512, 1, 1], f16), T([32, 512, 1, 1], f16)), {})
cnt: 3, ((T([32, 256, 1, 1], f16), T([32, 256, 1, 1], f16)), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([32, 1000], f16), [0], True), {})
cnt: 3, ((T([32, 2048, 7, 7], f16), [2, 3], True), {})
cnt: 36, ((T([32, 1024, 14, 14], f16), [2, 3], True), {})
cnt: 8, ((T([32, 512, 28, 28], f16), [2, 3], True), {})
cnt: 3, ((T([32, 256, 56, 56], f16), [2, 3], True), {})
Operator: aten.threshold_backward.default
cnt: 6, ((T([32, 2048, 7, 7], f16), T([32, 2048, 7, 7], f16), 0), {})
cnt: 3, ((T([32, 128, 1, 1], f16), T([32, 128, 1, 1], f16), 0), {})
cnt: 2, ((T([32, 1024, 7, 7], f16), T([32, 1024, 7, 7], f16), 0), {})
cnt: 73, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16), 0), {})
cnt: 36, ((T([32, 64, 1, 1], f16), T([32, 64, 1, 1], f16), 0), {})
cnt: 35, ((T([32, 512, 14, 14], f16), T([32, 512, 14, 14], f16), 0), {})
cnt: 17, ((T([32, 512, 28, 28], f16), T([32, 512, 28, 28], f16), 0), {})
cnt: 8, ((T([32, 32, 1, 1], f16), T([32, 32, 1, 1], f16), 0), {})
cnt: 7, ((T([32, 256, 28, 28], f16), T([32, 256, 28, 28], f16), 0), {})
cnt: 7, ((T([32, 256, 56, 56], f16), T([32, 256, 56, 56], f16), 0), {})
cnt: 3, ((T([32, 16, 1, 1], f16), T([32, 16, 1, 1], f16), 0), {})
cnt: 3, ((T([32, 128, 56, 56], f16), T([32, 128, 56, 56], f16), 0), {})
cnt: 1, ((T([32, 128, 112, 112], f16), T([32, 128, 112, 112], f16), 0), {})
cnt: 2, ((T([32, 64, 112, 112], f16), T([32, 64, 112, 112], f16), 0), {})

View File

@ -0,0 +1,155 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([32, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([32, 1000], f16), T([32, 1000], f16), 1, f16), {})
Operator: aten.add.Tensor
cnt: 2, ((T([32, 128, 75, 75], f16), T([32, 128, 75, 75], f16)), {})
cnt: 2, ((T([32, 256, 38, 38], f16), T([32, 256, 38, 38], f16)), {})
cnt: 34, ((T([32, 728, 19, 19], f16), T([32, 728, 19, 19], f16)), {})
cnt: 1, ((T([32, 1024, 10, 10], f16), T([32, 1024, 10, 10], f16)), {})
cnt: 1, ((T([32, 64, 150, 150], f16), T([32, 64, 150, 150], f16)), {})
Operator: aten.add_.Tensor
cnt: 132, ((T([], i64), 1), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([32, 2048], f16), T([2048, 1000], f16, stride=(1, 2048))), {})
Operator: aten.clone.default
cnt: 1, ((T([32, 3, 299, 299], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([32, 3, 299, 299], f16), T([32, 3, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 32, 150, 150], f16), T([64, 32, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 64, 150, 150], f16), T([128, 64, 1, 1], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 64, 150, 150], f16), T([64, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 64), {})
cnt: 1, ((T([32, 64, 150, 150], f16), T([128, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 128, 150, 150], f16), T([128, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 128), {})
cnt: 1, ((T([32, 128, 150, 150], f16), T([128, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 128, 150, 150], f16), T([128, 1, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 128), {})
cnt: 1, ((T([32, 128, 75, 75], f16), T([128, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 128, 75, 75], f16), T([256, 128, 1, 1], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 128, 75, 75], f16), T([128, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 128), {})
cnt: 1, ((T([32, 128, 75, 75], f16), T([256, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 256, 75, 75], f16), T([256, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 256), {})
cnt: 1, ((T([32, 256, 75, 75], f16), T([256, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 256, 75, 75], f16), T([256, 1, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 256), {})
cnt: 1, ((T([32, 256, 38, 38], f16), T([256, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 256, 38, 38], f16), T([728, 256, 1, 1], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 256, 38, 38], f16), T([256, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 256), {})
cnt: 1, ((T([32, 256, 38, 38], f16), T([728, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 728, 38, 38], f16), T([728, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 728), {})
cnt: 1, ((T([32, 728, 38, 38], f16), T([728, 728, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 728, 38, 38], f16), T([728, 1, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 728), {})
cnt: 50, ((T([32, 728, 19, 19], f16), T([728, 728, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 50, ((T([32, 728, 19, 19], f16), T([728, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 728), {})
cnt: 1, ((T([32, 728, 19, 19], f16), T([1024, 728, 1, 1], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 728, 19, 19], f16), T([1024, 728, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 1024, 19, 19], f16), T([1024, 1, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1024), {})
cnt: 1, ((T([32, 1024, 10, 10], f16), T([1024, 1024, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 1024, 10, 10], f16), T([1024, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1024), {})
cnt: 1, ((T([32, 1024, 10, 10], f16), T([1536, 1024, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([32, 1536, 10, 10], f16), T([1536, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1536), {})
cnt: 1, ((T([32, 1536, 10, 10], f16), T([1536, 1536, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 1536, 10, 10], f16), T([2048, 1536, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([32, 2048, 10, 10], f16), T([32, 1536, 10, 10], f16), T([2048, 1536, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([32, 1536, 10, 10], f16), T([32, 1536, 10, 10], f16), T([1536, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1536, [True, True, False]), {})
cnt: 1, ((T([32, 1536, 10, 10], f16), T([32, 1536, 10, 10], f16), T([1536, 1536, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 1536, 10, 10], f16), T([32, 1024, 10, 10], f16), T([1536, 1024, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 1024, 10, 10], f16), T([32, 1024, 10, 10], f16), T([1024, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1024, [True, True, False]), {})
cnt: 1, ((T([32, 1024, 10, 10], f16), T([32, 1024, 10, 10], f16), T([1024, 1024, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 1024, 10, 10], f16), T([32, 1024, 19, 19], f16), T([1024, 1, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1024, [True, True, False]), {})
cnt: 1, ((T([32, 1024, 19, 19], f16), T([32, 728, 19, 19], f16), T([1024, 728, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 50, ((T([32, 728, 19, 19], f16), T([32, 728, 19, 19], f16), T([728, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 728, [True, True, False]), {})
cnt: 50, ((T([32, 728, 19, 19], f16), T([32, 728, 19, 19], f16), T([728, 728, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 1024, 10, 10], f16), T([32, 728, 19, 19], f16), T([1024, 728, 1, 1], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 728, 19, 19], f16), T([32, 728, 38, 38], f16), T([728, 1, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 728, [True, True, False]), {})
cnt: 1, ((T([32, 728, 38, 38], f16), T([32, 728, 38, 38], f16), T([728, 728, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 728, 38, 38], f16), T([32, 728, 38, 38], f16), T([728, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 728, [True, True, False]), {})
cnt: 1, ((T([32, 728, 38, 38], f16), T([32, 256, 38, 38], f16), T([728, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 256, 38, 38], f16), T([32, 256, 38, 38], f16), T([256, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 256, [True, True, False]), {})
cnt: 1, ((T([32, 728, 19, 19], f16), T([32, 256, 38, 38], f16), T([728, 256, 1, 1], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 256, 38, 38], f16), T([32, 256, 38, 38], f16), T([256, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 256, 38, 38], f16), T([32, 256, 75, 75], f16), T([256, 1, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 256, [True, True, False]), {})
cnt: 1, ((T([32, 256, 75, 75], f16), T([32, 256, 75, 75], f16), T([256, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 256, 75, 75], f16), T([32, 256, 75, 75], f16), T([256, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 256, [True, True, False]), {})
cnt: 1, ((T([32, 256, 75, 75], f16), T([32, 128, 75, 75], f16), T([256, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 128, 75, 75], f16), T([32, 128, 75, 75], f16), T([128, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 128, [True, True, False]), {})
cnt: 1, ((T([32, 256, 38, 38], f16), T([32, 128, 75, 75], f16), T([256, 128, 1, 1], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 128, 75, 75], f16), T([32, 128, 75, 75], f16), T([128, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 128, 75, 75], f16), T([32, 128, 150, 150], f16), T([128, 1, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 128, [True, True, False]), {})
cnt: 1, ((T([32, 128, 150, 150], f16), T([32, 128, 150, 150], f16), T([128, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 128, 150, 150], f16), T([32, 128, 150, 150], f16), T([128, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 128, [True, True, False]), {})
cnt: 1, ((T([32, 128, 150, 150], f16), T([32, 64, 150, 150], f16), T([128, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 64, 150, 150], f16), T([32, 64, 150, 150], f16), T([64, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 64, [True, True, False]), {})
cnt: 1, ((T([32, 128, 75, 75], f16), T([32, 64, 150, 150], f16), T([128, 64, 1, 1], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 64, 150, 150], f16), T([32, 32, 150, 150], f16), T([64, 32, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 32, 150, 150], f16), T([32, 3, 299, 299], f16), T([32, 3, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [False, True, False]), {})
Operator: aten.copy_.default
cnt: 1, ((T([32, 3, 299, 299], f16), T([32, 3, 299, 299], f16)), {})
Operator: aten.div.Scalar
cnt: 1, ((T([32, 2048, 10, 10], f16, stride=(2048, 1, 0, 0)), 100), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([32], i64),), {})
Operator: aten.mean.dim
cnt: 1, ((T([32, 2048, 10, 10], f16), [-1, -2], True), {})
Operator: aten.mm.default
cnt: 1, ((T([32, 1000], f16), T([1000, 2048], f16)), {})
cnt: 1, ((T([1000, 32], f16, stride=(1, 1000)), T([32, 2048], f16)), {})
Operator: aten.native_batch_norm.default
cnt: 1, ((T([32, 32, 150, 150], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([32, 64, 150, 150], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([32, 128, 75, 75], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([32, 128, 150, 150], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([32, 256, 38, 38], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([32, 256, 75, 75], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 102, ((T([32, 728, 19, 19], f16), T([728], f16), T([728], f16), T([728], f16), T([728], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([32, 728, 38, 38], f16), T([728], f16), T([728], f16), T([728], f16), T([728], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([32, 1024, 10, 10], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([32, 1024, 19, 19], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([32, 1536, 10, 10], f16), T([1536], f16), T([1536], f16), T([1536], f16), T([1536], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([32, 2048, 10, 10], f16), T([2048], f16), T([2048], f16), T([2048], f16), T([2048], f16), True, 0.1, 1e-05), {})
Operator: aten.native_batch_norm_backward.default
cnt: 1, ((T([32, 2048, 10, 10], f16), T([32, 2048, 10, 10], f16), T([2048], f16), T([2048], f16), T([2048], f16), T([2048], f32), T([2048], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([32, 1536, 10, 10], f16), T([32, 1536, 10, 10], f16), T([1536], f16), T([1536], f16), T([1536], f16), T([1536], f32), T([1536], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([32, 1024, 10, 10], f16), T([32, 1024, 10, 10], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f32), T([1024], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([32, 1024, 19, 19], f16), T([32, 1024, 19, 19], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f32), T([1024], f32), True, 1e-05, [True, True, True]), {})
cnt: 102, ((T([32, 728, 19, 19], f16), T([32, 728, 19, 19], f16), T([728], f16), T([728], f16), T([728], f16), T([728], f32), T([728], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([32, 728, 38, 38], f16), T([32, 728, 38, 38], f16), T([728], f16), T([728], f16), T([728], f16), T([728], f32), T([728], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([32, 256, 38, 38], f16), T([32, 256, 38, 38], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([32, 256, 75, 75], f16), T([32, 256, 75, 75], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([32, 128, 75, 75], f16), T([32, 128, 75, 75], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([32, 128, 150, 150], f16), T([32, 128, 150, 150], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([32, 64, 150, 150], f16), T([32, 64, 150, 150], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([32, 32, 150, 150], f16), T([32, 32, 150, 150], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 1e-05, [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([32, 1000], f16), T([32], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([32, 1000], f16), T([32], i64), None, 1, -100), {})
Operator: aten.relu.default
cnt: 1, ((T([32, 256, 38, 38], f16),), {})
cnt: 17, ((T([32, 728, 19, 19], f16),), {})
Operator: aten.relu_.default
cnt: 1, ((T([32, 32, 150, 150], f16),), {})
cnt: 1, ((T([32, 64, 150, 150], f16),), {})
cnt: 2, ((T([32, 128, 150, 150], f16),), {})
cnt: 1, ((T([32, 128, 75, 75], f16),), {})
cnt: 2, ((T([32, 256, 75, 75], f16),), {})
cnt: 2, ((T([32, 728, 38, 38], f16),), {})
cnt: 33, ((T([32, 728, 19, 19], f16),), {})
cnt: 1, ((T([32, 1024, 19, 19], f16),), {})
cnt: 1, ((T([32, 1024, 10, 10], f16),), {})
cnt: 2, ((T([32, 1536, 10, 10], f16),), {})
cnt: 1, ((T([32, 2048, 10, 10], f16),), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([32, 1000], f16), [0], True), {})
Operator: aten.threshold_backward.default
cnt: 1, ((T([32, 2048, 10, 10], f16), T([32, 2048, 10, 10], f16), 0), {})
cnt: 2, ((T([32, 1536, 10, 10], f16), T([32, 1536, 10, 10], f16), 0), {})
cnt: 1, ((T([32, 1024, 10, 10], f16), T([32, 1024, 10, 10], f16), 0), {})
cnt: 1, ((T([32, 1024, 19, 19], f16), T([32, 1024, 19, 19], f16), 0), {})
cnt: 50, ((T([32, 728, 19, 19], f16), T([32, 728, 19, 19], f16), 0), {})
cnt: 2, ((T([32, 728, 38, 38], f16), T([32, 728, 38, 38], f16), 0), {})
cnt: 1, ((T([32, 256, 38, 38], f16), T([32, 256, 38, 38], f16), 0), {})
cnt: 2, ((T([32, 256, 75, 75], f16), T([32, 256, 75, 75], f16), 0), {})
cnt: 1, ((T([32, 128, 75, 75], f16), T([32, 128, 75, 75], f16), 0), {})
cnt: 2, ((T([32, 128, 150, 150], f16), T([32, 128, 150, 150], f16), 0), {})
cnt: 1, ((T([32, 64, 150, 150], f16), T([32, 64, 150, 150], f16), 0), {})
cnt: 1, ((T([32, 32, 150, 150], f16), T([32, 32, 150, 150], f16), 0), {})

View File

@ -0,0 +1,83 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([64, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([64, 1000], f16), T([64, 1000], f16), 1, f16), {})
Operator: aten._unsafe_view.default
cnt: 24, ((T([64, 384, 384], f16), [64, 384, 384]), {})
cnt: 24, ((T([64, 384, 196], f16), [24576, 196]), {})
Operator: aten.add.Tensor
cnt: 24, ((T([64, 384, 384], f16), T([384], f16)), {})
cnt: 24, ((T([64, 196, 384], f16, stride=(75264, 1, 196)), T([64, 196, 384], f16, stride=(75264, 1, 196))), {})
cnt: 24, ((T([64, 196, 384], f16, stride=(75264, 1, 196)), T([64, 196, 384], f16)), {})
cnt: 24, ((T([64, 196, 384], f16), T([64, 196, 384], f16)), {})
cnt: 24, ((T([64, 196, 384], f16), T([64, 196, 384], f16, stride=(75264, 1, 196))), {})
Operator: aten.addmm.default
cnt: 24, ((T([196], f16), T([24576, 192], f16), T([192, 196], f16, stride=(1, 192))), {})
cnt: 24, ((T([1536], f16), T([12544, 384], f16), T([384, 1536], f16, stride=(1, 384))), {})
cnt: 24, ((T([384], f16), T([12544, 768], f16), T([768, 384], f16, stride=(1, 768))), {})
cnt: 1, ((T([1000], f16), T([64, 384], f16), T([384, 1000], f16, stride=(1, 384))), {})
Operator: aten.bmm.default
cnt: 24, ((T([64, 384, 196], f16, stride=(75264, 1, 384)), T([64, 196, 384], f16, stride=(0, 1, 196))), {})
cnt: 24, ((T([64, 196, 384], f16), T([64, 384, 384], f16)), {})
cnt: 24, ((T([64, 384, 384], f16), T([64, 384, 196], f16, stride=(0, 196, 1))), {})
Operator: aten.cat.default
cnt: 24, (([T([64, 196, 768], f16), T([64, 196, 768], f16)], 2), {})
cnt: 24, (([T([64, 384, 192], f16), T([64, 384, 192], f16)], 2), {})
Operator: aten.clone.default
cnt: 1, ((T([64, 3, 224, 224], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([64, 3, 224, 224], f16), T([384, 3, 16, 16], f16), T([384], f16), [16, 16], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([64, 384, 14, 14], f16, stride=(75264, 1, 5376, 384)), T([64, 3, 224, 224], f16), T([384, 3, 16, 16], f16), [384], [16, 16], [0, 0], [1, 1], False, [0, 0], 1, [False, True, True]), {})
Operator: aten.copy_.default
cnt: 1, ((T([64, 3, 224, 224], f16), T([64, 3, 224, 224], f16)), {})
cnt: 24, ((T([384, 196], f16), T([384, 196], f16, stride=(1, 384))), {})
Operator: aten.div.Scalar
cnt: 1, ((T([64, 196, 384], f16, stride=(384, 0, 1)), 196), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([64], i64),), {})
Operator: aten.mean.dim
cnt: 1, ((T([64, 196, 384], f16), [1]), {})
Operator: aten.mm.default
cnt: 1, ((T([64, 1000], f16), T([1000, 384], f16)), {})
cnt: 1, ((T([1000, 64], f16, stride=(1, 1000)), T([64, 384], f16)), {})
cnt: 24, ((T([12544, 384], f16), T([384, 768], f16)), {})
cnt: 24, ((T([384, 12544], f16, stride=(1, 384)), T([12544, 768], f16)), {})
cnt: 24, ((T([12544, 1536], f16), T([1536, 384], f16)), {})
cnt: 24, ((T([1536, 12544], f16, stride=(1, 1536)), T([12544, 384], f16)), {})
cnt: 24, ((T([24576, 196], f16), T([196, 192], f16)), {})
cnt: 24, ((T([196, 24576], f16, stride=(1, 196)), T([24576, 192], f16)), {})
Operator: aten.mul.Tensor
cnt: 24, ((T([64, 384, 192], f16, stride=(147456, 384, 1)), T([64, 384, 192], f16)), {})
cnt: 24, ((T([64, 196, 768], f16, stride=(301056, 1536, 1)), T([64, 196, 768], f16)), {})
cnt: 24, ((T([64, 196, 768], f16), T([64, 196, 768], f16, stride=(301056, 1536, 1))), {})
cnt: 24, ((T([64, 196, 768], f16), T([64, 196, 768], f16)), {})
cnt: 24, ((T([64, 384, 192], f16), T([64, 384, 192], f16, stride=(147456, 384, 1))), {})
cnt: 24, ((T([64, 384, 192], f16), T([64, 384, 192], f16)), {})
Operator: aten.native_layer_norm.default
cnt: 49, ((T([64, 196, 384], f16, stride=(75264, 1, 196)), [384], T([384], f16), T([384], f16), 1e-06), {})
Operator: aten.native_layer_norm_backward.default
cnt: 25, ((T([64, 196, 384], f16), T([64, 196, 384], f16, stride=(75264, 1, 196)), [384], T([64, 196, 1], f32), T([64, 196, 1], f32), T([384], f16), T([384], f16), [True, True, True]), {})
cnt: 24, ((T([64, 196, 384], f16, stride=(75264, 1, 196)), T([64, 196, 384], f16, stride=(75264, 1, 196)), [384], T([64, 196, 1], f32), T([64, 196, 1], f32), T([384], f16), T([384], f16), [True, True, True]), {})
Operator: aten.new_empty_strided.default
cnt: 24, ((T([384, 196], f16, stride=(1, 384)), [384, 196], [196, 1]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([64, 1000], f16), T([64], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([64, 1000], f16), T([64], i64), None, 1, -100), {})
Operator: aten.silu.default
cnt: 24, ((T([64, 384, 192], f16, stride=(147456, 384, 1)),), {})
cnt: 24, ((T([64, 196, 768], f16, stride=(301056, 1536, 1)),), {})
Operator: aten.silu_backward.default
cnt: 24, ((T([64, 196, 768], f16), T([64, 196, 768], f16, stride=(301056, 1536, 1))), {})
cnt: 24, ((T([64, 384, 192], f16), T([64, 384, 192], f16, stride=(147456, 384, 1))), {})
Operator: aten.split.Tensor
cnt: 24, ((T([64, 384, 384], f16), 192, -1), {})
cnt: 24, ((T([64, 196, 1536], f16), 768, -1), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([64, 1000], f16), [0], True), {})
cnt: 24, ((T([12544, 384], f16), [0], True), {})
cnt: 24, ((T([12544, 1536], f16), [0], True), {})
cnt: 24, ((T([24576, 196], f16), [0], True), {})
cnt: 24, ((T([64, 384, 384], f16), [0, 1], True), {})
cnt: 24, ((T([64, 196, 384], f16), [0], True), {})

View File

@ -0,0 +1,70 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([64, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([64, 1000], f16), T([64, 1000], f16), 1, f16), {})
Operator: aten._unsafe_view.default
cnt: 30, ((T([64, 768, 196], f16), [64, 768, 196]), {})
Operator: aten.add.Tensor
cnt: 30, ((T([64, 768, 196], f16), T([196], f16)), {})
cnt: 30, ((T([64, 196, 256], f16, stride=(50176, 1, 196)), T([64, 196, 256], f16)), {})
cnt: 30, ((T([64, 196, 256], f16), T([64, 196, 256], f16)), {})
Operator: aten.addmm.default
cnt: 30, ((T([1536], f16), T([12544, 256], f16), T([256, 1536], f16, stride=(1, 256))), {})
cnt: 30, ((T([256], f16), T([12544, 768], f16), T([768, 256], f16, stride=(1, 768))), {})
cnt: 1, ((T([1000], f16), T([64, 256], f16), T([256, 1000], f16, stride=(1, 256))), {})
Operator: aten.bmm.default
cnt: 30, ((T([64, 768, 196], f16, stride=(150528, 1, 768)), T([64, 196, 196], f16, stride=(0, 1, 196))), {})
cnt: 30, ((T([64, 196, 768], f16), T([64, 768, 196], f16, stride=(150528, 1, 768))), {})
cnt: 30, ((T([64, 768, 196], f16, stride=(150528, 1, 768)), T([64, 196, 196], f16, stride=(0, 196, 1))), {})
Operator: aten.cat.default
cnt: 30, (([T([64, 196, 768], f16), T([64, 196, 768], f16, stride=(150528, 1, 196))], 2), {})
Operator: aten.clone.default
cnt: 1, ((T([64, 3, 224, 224], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([64, 3, 224, 224], f16), T([256, 3, 16, 16], f16), T([256], f16), [16, 16], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([64, 256, 14, 14], f16, stride=(50176, 1, 3584, 256)), T([64, 3, 224, 224], f16), T([256, 3, 16, 16], f16), [256], [16, 16], [0, 0], [1, 1], False, [0, 0], 1, [False, True, True]), {})
Operator: aten.copy_.default
cnt: 1, ((T([64, 3, 224, 224], f16), T([64, 3, 224, 224], f16)), {})
cnt: 30, ((T([196, 196], f16), T([196, 196], f16, stride=(1, 196))), {})
Operator: aten.div.Scalar
cnt: 1, ((T([64, 196, 256], f16, stride=(256, 0, 1)), 196), {})
Operator: aten.gelu.default
cnt: 30, ((T([64, 196, 1536], f16),), {})
Operator: aten.gelu_backward.default
cnt: 30, ((T([64, 196, 1536], f16), T([64, 196, 1536], f16)), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([64], i64),), {})
Operator: aten.mean.dim
cnt: 1, ((T([64, 196, 256], f16), [1]), {})
Operator: aten.mm.default
cnt: 1, ((T([64, 1000], f16), T([1000, 256], f16)), {})
cnt: 1, ((T([1000, 64], f16, stride=(1, 1000)), T([64, 256], f16)), {})
cnt: 30, ((T([12544, 256], f16), T([256, 768], f16)), {})
cnt: 30, ((T([256, 12544], f16, stride=(1, 256)), T([12544, 768], f16)), {})
cnt: 30, ((T([12544, 1536], f16), T([1536, 256], f16)), {})
cnt: 30, ((T([1536, 12544], f16, stride=(1, 1536)), T([12544, 256], f16)), {})
Operator: aten.mul.Tensor
cnt: 30, ((T([64, 196, 768], f16, stride=(301056, 1536, 1)), T([64, 196, 768], f16, stride=(150528, 1, 196))), {})
cnt: 30, ((T([64, 196, 768], f16), T([64, 196, 768], f16, stride=(301056, 1536, 1))), {})
cnt: 30, ((T([64, 196, 768], f16), T([64, 196, 768], f16, stride=(150528, 1, 196))), {})
Operator: aten.native_layer_norm.default
cnt: 31, ((T([64, 196, 256], f16, stride=(50176, 1, 196)), [256], T([256], f16), T([256], f16), 1e-06), {})
cnt: 30, ((T([64, 196, 768], f16, stride=(301056, 1536, 1)), [768], T([768], f16), T([768], f16), 1e-05), {})
Operator: aten.native_layer_norm_backward.default
cnt: 31, ((T([64, 196, 256], f16), T([64, 196, 256], f16, stride=(50176, 1, 196)), [256], T([64, 196, 1], f32), T([64, 196, 1], f32), T([256], f16), T([256], f16), [True, True, True]), {})
cnt: 30, ((T([64, 196, 768], f16, stride=(150528, 1, 196)), T([64, 196, 768], f16, stride=(301056, 1536, 1)), [768], T([64, 196, 1], f32), T([64, 196, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
Operator: aten.new_empty_strided.default
cnt: 30, ((T([196, 196], f16, stride=(1, 196)), [196, 196], [196, 1]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([64, 1000], f16), T([64], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([64, 1000], f16), T([64], i64), None, 1, -100), {})
Operator: aten.split.Tensor
cnt: 30, ((T([64, 196, 1536], f16), 768, -1), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([64, 1000], f16), [0], True), {})
cnt: 30, ((T([12544, 256], f16), [0], True), {})
cnt: 30, ((T([64, 768, 196], f16, stride=(150528, 1, 768)), [0, 1], True), {})
cnt: 30, ((T([64, 196, 196], f16), [0], True), {})
cnt: 30, ((T([12544, 1536], f16), [0], True), {})

View File

@ -0,0 +1,260 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([128, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([128, 1000], f16), T([128, 1000], f16), 1, f16), {})
Operator: aten.add.Tensor
cnt: 34, ((T([], i64), 1), {})
cnt: 2, ((T([128, 24, 56, 56], f16), T([128, 24, 56, 56], f16)), {})
cnt: 2, ((T([128, 40, 28, 28], f16), T([128, 40, 28, 28], f16)), {})
cnt: 2, ((T([128, 80, 14, 14], f16), T([128, 80, 14, 14], f16)), {})
cnt: 2, ((T([128, 112, 14, 14], f16), T([128, 112, 14, 14], f16)), {})
cnt: 2, ((T([128, 192, 7, 7], f16), T([128, 192, 7, 7], f16)), {})
cnt: 1, ((T([128, 1152, 7, 7], f16), T([128, 1152, 7, 7], f16)), {})
cnt: 1, ((T([128, 672, 7, 7], f16), T([128, 672, 7, 7], f16)), {})
cnt: 1, ((T([128, 672, 14, 14], f16), T([128, 672, 14, 14], f16)), {})
cnt: 2, ((T([128, 480, 14, 14], f16), T([128, 480, 14, 14], f16)), {})
cnt: 1, ((T([128, 240, 14, 14], f16), T([128, 240, 14, 14], f16)), {})
cnt: 1, ((T([128, 240, 28, 28], f16), T([128, 240, 28, 28], f16)), {})
cnt: 1, ((T([128, 72, 56, 56], f16), T([128, 72, 56, 56], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([128, 1280], f16), T([1280, 1000], f16, stride=(1, 1280))), {})
Operator: aten.clone.default
cnt: 1, ((T([128, 3, 224, 224], f16),), {})
cnt: 1, ((T([128, 32, 112, 112], f16),), {})
cnt: 1, ((T([128, 240, 28, 28], f16),), {})
cnt: 1, ((T([128, 240, 14, 14], f16),), {})
cnt: 4, ((T([128, 480, 14, 14], f16),), {})
cnt: 3, ((T([128, 672, 14, 14], f16),), {})
cnt: 1, ((T([128, 672, 7, 7], f16),), {})
cnt: 2, ((T([128, 1152, 7, 7], f16),), {})
cnt: 1, ((T([128, 960, 7, 7], f16),), {})
cnt: 1, ((T([128, 1280, 1, 1], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([128, 3, 224, 224], f16), T([32, 3, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 32, 112, 112], f16), T([32, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 32), {})
cnt: 1, ((T([128, 32, 112, 112], f16), T([16, 32, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([48, 16, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 48, 112, 112], f16), T([48, 1, 5, 5], f16), None, [2, 2], [2, 2], [1, 1], False, [0, 0], 48), {})
cnt: 1, ((T([128, 48, 56, 56], f16), T([24, 48, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 24, 56, 56], f16), T([72, 24, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 72, 56, 56], f16), T([72, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 72), {})
cnt: 1, ((T([128, 72, 1, 1], f16), T([24, 72, 1, 1], f16), T([24], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 24, 1, 1], f16), T([72, 24, 1, 1], f16), T([72], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 72, 56, 56], f16), T([24, 72, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 72, 56, 56], f16), T([72, 1, 5, 5], f16), None, [2, 2], [2, 2], [1, 1], False, [0, 0], 72), {})
cnt: 1, ((T([128, 72, 28, 28], f16), T([40, 72, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 40, 28, 28], f16), T([240, 40, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 240, 28, 28], f16), T([240, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 240), {})
cnt: 2, ((T([128, 240, 1, 1], f16), T([64, 240, 1, 1], f16), T([64], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 64, 1, 1], f16), T([240, 64, 1, 1], f16), T([240], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 240, 28, 28], f16), T([40, 240, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 240, 28, 28], f16), T([240, 1, 5, 5], f16), None, [2, 2], [2, 2], [1, 1], False, [0, 0], 240), {})
cnt: 1, ((T([128, 240, 14, 14], f16), T([80, 240, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 80, 14, 14], f16), T([480, 80, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 480, 14, 14], f16), T([480, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 480), {})
cnt: 2, ((T([128, 480, 1, 1], f16), T([120, 480, 1, 1], f16), T([120], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 120, 1, 1], f16), T([480, 120, 1, 1], f16), T([480], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 480, 14, 14], f16), T([80, 480, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 480, 14, 14], f16), T([112, 480, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 112, 14, 14], f16), T([672, 112, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 672, 14, 14], f16), T([672, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 672), {})
cnt: 2, ((T([128, 672, 1, 1], f16), T([168, 672, 1, 1], f16), T([168], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 168, 1, 1], f16), T([672, 168, 1, 1], f16), T([672], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 672, 14, 14], f16), T([112, 672, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 672, 14, 14], f16), T([672, 1, 5, 5], f16), None, [2, 2], [2, 2], [1, 1], False, [0, 0], 672), {})
cnt: 1, ((T([128, 672, 7, 7], f16), T([192, 672, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 192, 7, 7], f16), T([1152, 192, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1152, 7, 7], f16), T([1152, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 1152), {})
cnt: 1, ((T([128, 1152, 1, 1], f16), T([288, 1152, 1, 1], f16), T([288], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 288, 1, 1], f16), T([1152, 288, 1, 1], f16), T([1152], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1152, 7, 7], f16), T([192, 1152, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 192, 7, 7], f16), T([960, 192, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 960, 1, 1], f16), T([1280, 960, 1, 1], f16), T([1280], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([128, 1280, 1, 1], f16), T([128, 960, 1, 1], f16), T([1280, 960, 1, 1], f16), [1280], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 960, 7, 7], f16), T([128, 192, 7, 7], f16), T([960, 192, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 192, 7, 7], f16), T([128, 1152, 7, 7], f16), T([192, 1152, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 1152, 1, 1], f16), T([128, 288, 1, 1], f16), T([1152, 288, 1, 1], f16), [1152], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 288, 1, 1], f16), T([128, 1152, 1, 1], f16), T([288, 1152, 1, 1], f16), [288], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 1152, 7, 7], f16), T([128, 1152, 7, 7], f16), T([1152, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 1152, [True, True, False]), {})
cnt: 1, ((T([128, 1152, 7, 7], f16), T([128, 192, 7, 7], f16), T([1152, 192, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 192, 7, 7], f16), T([128, 672, 7, 7], f16), T([192, 672, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 672, 1, 1], f16), T([128, 168, 1, 1], f16), T([672, 168, 1, 1], f16), [672], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 2, ((T([128, 168, 1, 1], f16), T([128, 672, 1, 1], f16), T([168, 672, 1, 1], f16), [168], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 672, 7, 7], f16), T([128, 672, 14, 14], f16), T([672, 1, 5, 5], f16), [0], [2, 2], [2, 2], [1, 1], False, [0, 0], 672, [True, True, False]), {})
cnt: 2, ((T([128, 672, 14, 14], f16), T([128, 112, 14, 14], f16), T([672, 112, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 112, 14, 14], f16), T([128, 672, 14, 14], f16), T([112, 672, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 672, 14, 14], f16), T([128, 672, 14, 14], f16), T([672, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 672, [True, True, False]), {})
cnt: 1, ((T([128, 112, 14, 14], f16), T([128, 480, 14, 14], f16), T([112, 480, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 480, 1, 1], f16), T([128, 120, 1, 1], f16), T([480, 120, 1, 1], f16), [480], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 2, ((T([128, 120, 1, 1], f16), T([128, 480, 1, 1], f16), T([120, 480, 1, 1], f16), [120], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 2, ((T([128, 480, 14, 14], f16), T([128, 480, 14, 14], f16), T([480, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 480, [True, True, False]), {})
cnt: 2, ((T([128, 480, 14, 14], f16), T([128, 80, 14, 14], f16), T([480, 80, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 80, 14, 14], f16), T([128, 480, 14, 14], f16), T([80, 480, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 80, 14, 14], f16), T([128, 240, 14, 14], f16), T([80, 240, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 240, 1, 1], f16), T([128, 64, 1, 1], f16), T([240, 64, 1, 1], f16), [240], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 2, ((T([128, 64, 1, 1], f16), T([128, 240, 1, 1], f16), T([64, 240, 1, 1], f16), [64], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 240, 14, 14], f16), T([128, 240, 28, 28], f16), T([240, 1, 5, 5], f16), [0], [2, 2], [2, 2], [1, 1], False, [0, 0], 240, [True, True, False]), {})
cnt: 2, ((T([128, 240, 28, 28], f16), T([128, 40, 28, 28], f16), T([240, 40, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 40, 28, 28], f16), T([128, 240, 28, 28], f16), T([40, 240, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 240, 28, 28], f16), T([128, 240, 28, 28], f16), T([240, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 240, [True, True, False]), {})
cnt: 1, ((T([128, 40, 28, 28], f16), T([128, 72, 28, 28], f16), T([40, 72, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 72, 28, 28], f16), T([128, 72, 56, 56], f16), T([72, 1, 5, 5], f16), [0], [2, 2], [2, 2], [1, 1], False, [0, 0], 72, [True, True, False]), {})
cnt: 2, ((T([128, 72, 56, 56], f16), T([128, 24, 56, 56], f16), T([72, 24, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 24, 56, 56], f16), T([128, 72, 56, 56], f16), T([24, 72, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 72, 1, 1], f16), T([128, 24, 1, 1], f16), T([72, 24, 1, 1], f16), [72], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 24, 1, 1], f16), T([128, 72, 1, 1], f16), T([24, 72, 1, 1], f16), [24], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 72, 56, 56], f16), T([128, 72, 56, 56], f16), T([72, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 72, [True, True, False]), {})
cnt: 1, ((T([128, 24, 56, 56], f16), T([128, 48, 56, 56], f16), T([24, 48, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 48, 56, 56], f16), T([128, 48, 112, 112], f16), T([48, 1, 5, 5], f16), [0], [2, 2], [2, 2], [1, 1], False, [0, 0], 48, [True, True, False]), {})
cnt: 1, ((T([128, 48, 112, 112], f16), T([128, 16, 112, 112], f16), T([48, 16, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([128, 32, 112, 112], f16), T([16, 32, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 32, 112, 112], f16), T([128, 32, 112, 112], f16), T([32, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 32, [True, True, False]), {})
cnt: 1, ((T([128, 32, 112, 112], f16), T([128, 3, 224, 224], f16), T([32, 3, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [False, True, False]), {})
Operator: aten.copy_.default
cnt: 1, ((T([128, 3, 224, 224], f16), T([128, 3, 224, 224], f16)), {})
Operator: aten.div.Scalar
cnt: 1, ((T([128, 960, 7, 7], f16, stride=(960, 1, 0, 0)), 49), {})
cnt: 1, ((T([128, 1152, 7, 7], f16, stride=(1152, 1, 0, 0)), 49), {})
cnt: 1, ((T([128, 672, 7, 7], f16, stride=(672, 1, 0, 0)), 49), {})
cnt: 1, ((T([128, 672, 14, 14], f16, stride=(672, 1, 0, 0)), 196), {})
cnt: 2, ((T([128, 480, 14, 14], f16, stride=(480, 1, 0, 0)), 196), {})
cnt: 1, ((T([128, 240, 14, 14], f16, stride=(240, 1, 0, 0)), 196), {})
cnt: 1, ((T([128, 240, 28, 28], f16, stride=(240, 1, 0, 0)), 784), {})
cnt: 1, ((T([128, 72, 56, 56], f16, stride=(72, 1, 0, 0)), 3136), {})
Operator: aten.hardsigmoid.default
cnt: 1, ((T([128, 72, 1, 1], f16),), {})
cnt: 2, ((T([128, 240, 1, 1], f16),), {})
cnt: 2, ((T([128, 480, 1, 1], f16),), {})
cnt: 2, ((T([128, 672, 1, 1], f16),), {})
cnt: 1, ((T([128, 1152, 1, 1], f16),), {})
Operator: aten.hardsigmoid_backward.default
cnt: 1, ((T([128, 1152, 1, 1], f16), T([128, 1152, 1, 1], f16)), {})
cnt: 2, ((T([128, 672, 1, 1], f16), T([128, 672, 1, 1], f16)), {})
cnt: 2, ((T([128, 480, 1, 1], f16), T([128, 480, 1, 1], f16)), {})
cnt: 2, ((T([128, 240, 1, 1], f16), T([128, 240, 1, 1], f16)), {})
cnt: 1, ((T([128, 72, 1, 1], f16), T([128, 72, 1, 1], f16)), {})
Operator: aten.hardswish_.default
cnt: 1, ((T([128, 32, 112, 112], f16),), {})
cnt: 1, ((T([128, 240, 28, 28], f16),), {})
cnt: 1, ((T([128, 240, 14, 14], f16),), {})
cnt: 4, ((T([128, 480, 14, 14], f16),), {})
cnt: 3, ((T([128, 672, 14, 14], f16),), {})
cnt: 1, ((T([128, 672, 7, 7], f16),), {})
cnt: 2, ((T([128, 1152, 7, 7], f16),), {})
cnt: 1, ((T([128, 960, 7, 7], f16),), {})
cnt: 1, ((T([128, 1280, 1, 1], f16),), {})
Operator: aten.hardswish_backward.default
cnt: 1, ((T([128, 1280, 1, 1], f16), T([128, 1280, 1, 1], f16)), {})
cnt: 1, ((T([128, 960, 7, 7], f16), T([128, 960, 7, 7], f16)), {})
cnt: 2, ((T([128, 1152, 7, 7], f16), T([128, 1152, 7, 7], f16)), {})
cnt: 1, ((T([128, 672, 7, 7], f16), T([128, 672, 7, 7], f16)), {})
cnt: 3, ((T([128, 672, 14, 14], f16), T([128, 672, 14, 14], f16)), {})
cnt: 4, ((T([128, 480, 14, 14], f16), T([128, 480, 14, 14], f16)), {})
cnt: 1, ((T([128, 240, 14, 14], f16), T([128, 240, 14, 14], f16)), {})
cnt: 1, ((T([128, 240, 28, 28], f16), T([128, 240, 28, 28], f16)), {})
cnt: 1, ((T([128, 32, 112, 112], f16), T([128, 32, 112, 112], f16)), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([128], i64),), {})
Operator: aten.mean.dim
cnt: 1, ((T([128, 72, 56, 56], f16), [2, 3], True), {})
cnt: 1, ((T([128, 240, 28, 28], f16), [2, 3], True), {})
cnt: 1, ((T([128, 240, 14, 14], f16), [2, 3], True), {})
cnt: 2, ((T([128, 480, 14, 14], f16), [2, 3], True), {})
cnt: 1, ((T([128, 672, 14, 14], f16), [2, 3], True), {})
cnt: 1, ((T([128, 672, 7, 7], f16), [2, 3], True), {})
cnt: 1, ((T([128, 1152, 7, 7], f16), [2, 3], True), {})
cnt: 1, ((T([128, 960, 7, 7], f16), [-1, -2], True), {})
Operator: aten.mm.default
cnt: 1, ((T([128, 1000], f16), T([1000, 1280], f16)), {})
cnt: 1, ((T([1000, 128], f16, stride=(1, 1000)), T([128, 1280], f16)), {})
Operator: aten.mul.Tensor
cnt: 2, ((T([128, 72, 56, 56], f16), T([128, 72, 1, 1], f16)), {})
cnt: 2, ((T([128, 240, 28, 28], f16), T([128, 240, 1, 1], f16)), {})
cnt: 2, ((T([128, 240, 14, 14], f16), T([128, 240, 1, 1], f16)), {})
cnt: 4, ((T([128, 480, 14, 14], f16), T([128, 480, 1, 1], f16)), {})
cnt: 2, ((T([128, 672, 14, 14], f16), T([128, 672, 1, 1], f16)), {})
cnt: 2, ((T([128, 672, 7, 7], f16), T([128, 672, 1, 1], f16)), {})
cnt: 2, ((T([128, 1152, 7, 7], f16), T([128, 1152, 1, 1], f16)), {})
cnt: 1, ((T([128, 1152, 7, 7], f16), T([128, 1152, 7, 7], f16)), {})
cnt: 1, ((T([128, 672, 7, 7], f16), T([128, 672, 7, 7], f16)), {})
cnt: 1, ((T([128, 672, 14, 14], f16), T([128, 672, 14, 14], f16)), {})
cnt: 2, ((T([128, 480, 14, 14], f16), T([128, 480, 14, 14], f16)), {})
cnt: 1, ((T([128, 240, 14, 14], f16), T([128, 240, 14, 14], f16)), {})
cnt: 1, ((T([128, 240, 28, 28], f16), T([128, 240, 28, 28], f16)), {})
cnt: 1, ((T([128, 72, 56, 56], f16), T([128, 72, 56, 56], f16)), {})
Operator: aten.native_batch_norm.default
cnt: 2, ((T([128, 32, 112, 112], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([16], f16), T([16], f16), T([16], f16), T([16], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 48, 112, 112], f16), T([48], f16), T([48], f16), T([48], f16), T([48], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 48, 56, 56], f16), T([48], f16), T([48], f16), T([48], f16), T([48], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([128, 24, 56, 56], f16), T([24], f16), T([24], f16), T([24], f16), T([24], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 72, 56, 56], f16), T([72], f16), T([72], f16), T([72], f16), T([72], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 72, 28, 28], f16), T([72], f16), T([72], f16), T([72], f16), T([72], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([128, 40, 28, 28], f16), T([40], f16), T([40], f16), T([40], f16), T([40], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 240, 28, 28], f16), T([240], f16), T([240], f16), T([240], f16), T([240], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 240, 14, 14], f16), T([240], f16), T([240], f16), T([240], f16), T([240], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([128, 80, 14, 14], f16), T([80], f16), T([80], f16), T([80], f16), T([80], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([128, 480, 14, 14], f16), T([480], f16), T([480], f16), T([480], f16), T([480], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([128, 112, 14, 14], f16), T([112], f16), T([112], f16), T([112], f16), T([112], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 672, 14, 14], f16), T([672], f16), T([672], f16), T([672], f16), T([672], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 672, 7, 7], f16), T([672], f16), T([672], f16), T([672], f16), T([672], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([128, 192, 7, 7], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([128, 1152, 7, 7], f16), T([1152], f16), T([1152], f16), T([1152], f16), T([1152], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 960, 7, 7], f16), T([960], f16), T([960], f16), T([960], f16), T([960], f16), True, 0.1, 1e-05), {})
Operator: aten.native_batch_norm_backward.default
cnt: 1, ((T([128, 960, 7, 7], f16), T([128, 960, 7, 7], f16), T([960], f16), T([960], f16), T([960], f16), T([960], f32), T([960], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 192, 7, 7], f16), T([128, 192, 7, 7], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f32), T([192], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 1152, 7, 7], f16), T([128, 1152, 7, 7], f16), T([1152], f16), T([1152], f16), T([1152], f16), T([1152], f32), T([1152], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 672, 7, 7], f16), T([128, 672, 7, 7], f16), T([672], f16), T([672], f16), T([672], f16), T([672], f32), T([672], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 672, 14, 14], f16), T([128, 672, 14, 14], f16), T([672], f16), T([672], f16), T([672], f16), T([672], f32), T([672], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 112, 14, 14], f16), T([128, 112, 14, 14], f16), T([112], f16), T([112], f16), T([112], f16), T([112], f32), T([112], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([128, 480, 14, 14], f16), T([128, 480, 14, 14], f16), T([480], f16), T([480], f16), T([480], f16), T([480], f32), T([480], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 80, 14, 14], f16), T([128, 80, 14, 14], f16), T([80], f16), T([80], f16), T([80], f16), T([80], f32), T([80], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 240, 14, 14], f16), T([128, 240, 14, 14], f16), T([240], f16), T([240], f16), T([240], f16), T([240], f32), T([240], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 240, 28, 28], f16), T([128, 240, 28, 28], f16), T([240], f16), T([240], f16), T([240], f16), T([240], f32), T([240], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 40, 28, 28], f16), T([128, 40, 28, 28], f16), T([40], f16), T([40], f16), T([40], f16), T([40], f32), T([40], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 72, 28, 28], f16), T([128, 72, 28, 28], f16), T([72], f16), T([72], f16), T([72], f16), T([72], f32), T([72], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 72, 56, 56], f16), T([128, 72, 56, 56], f16), T([72], f16), T([72], f16), T([72], f16), T([72], f32), T([72], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 24, 56, 56], f16), T([128, 24, 56, 56], f16), T([24], f16), T([24], f16), T([24], f16), T([24], f32), T([24], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 48, 56, 56], f16), T([128, 48, 56, 56], f16), T([48], f16), T([48], f16), T([48], f16), T([48], f32), T([48], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 48, 112, 112], f16), T([128, 48, 112, 112], f16), T([48], f16), T([48], f16), T([48], f16), T([48], f32), T([48], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([128, 16, 112, 112], f16), T([16], f16), T([16], f16), T([16], f16), T([16], f32), T([16], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 32, 112, 112], f16), T([128, 32, 112, 112], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 1e-05, [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([128, 1000], f16), T([128], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([128, 1000], f16), T([128], i64), None, 1, -100), {})
Operator: aten.relu_.default
cnt: 1, ((T([128, 32, 112, 112], f16),), {})
cnt: 1, ((T([128, 48, 112, 112], f16),), {})
cnt: 1, ((T([128, 48, 56, 56], f16),), {})
cnt: 3, ((T([128, 72, 56, 56], f16),), {})
cnt: 1, ((T([128, 24, 1, 1], f16),), {})
cnt: 1, ((T([128, 72, 28, 28], f16),), {})
cnt: 2, ((T([128, 240, 28, 28], f16),), {})
cnt: 2, ((T([128, 64, 1, 1], f16),), {})
cnt: 2, ((T([128, 120, 1, 1], f16),), {})
cnt: 2, ((T([128, 168, 1, 1], f16),), {})
cnt: 1, ((T([128, 288, 1, 1], f16),), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([128, 1000], f16), [0], True), {})
cnt: 1, ((T([128, 1152, 7, 7], f16), [2, 3], True), {})
cnt: 1, ((T([128, 672, 7, 7], f16), [2, 3], True), {})
cnt: 1, ((T([128, 672, 14, 14], f16), [2, 3], True), {})
cnt: 2, ((T([128, 480, 14, 14], f16), [2, 3], True), {})
cnt: 1, ((T([128, 240, 14, 14], f16), [2, 3], True), {})
cnt: 1, ((T([128, 240, 28, 28], f16), [2, 3], True), {})
cnt: 1, ((T([128, 72, 56, 56], f16), [2, 3], True), {})
Operator: aten.threshold_backward.default
cnt: 1, ((T([128, 288, 1, 1], f16), T([128, 288, 1, 1], f16), 0), {})
cnt: 2, ((T([128, 168, 1, 1], f16), T([128, 168, 1, 1], f16), 0), {})
cnt: 2, ((T([128, 120, 1, 1], f16), T([128, 120, 1, 1], f16), 0), {})
cnt: 2, ((T([128, 64, 1, 1], f16), T([128, 64, 1, 1], f16), 0), {})
cnt: 2, ((T([128, 240, 28, 28], f16), T([128, 240, 28, 28], f16), 0), {})
cnt: 1, ((T([128, 72, 28, 28], f16), T([128, 72, 28, 28], f16), 0), {})
cnt: 3, ((T([128, 72, 56, 56], f16), T([128, 72, 56, 56], f16), 0), {})
cnt: 1, ((T([128, 24, 1, 1], f16), T([128, 24, 1, 1], f16), 0), {})
cnt: 1, ((T([128, 48, 56, 56], f16), T([128, 48, 56, 56], f16), 0), {})
cnt: 1, ((T([128, 48, 112, 112], f16), T([128, 48, 112, 112], f16), 0), {})
cnt: 1, ((T([128, 32, 112, 112], f16), T([128, 32, 112, 112], f16), 0), {})

View File

@ -0,0 +1,247 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([128, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([128, 1000], f16), T([128, 1000], f16), 1, f16), {})
Operator: aten.add.Tensor
cnt: 69, ((T([128, 18, 56, 56], f16), T([128, 18, 56, 56], f16)), {})
cnt: 70, ((T([128, 36, 28, 28], f16), T([128, 36, 28, 28], f16)), {})
cnt: 64, ((T([128, 72, 14, 14], f16), T([128, 72, 14, 14], f16)), {})
cnt: 31, ((T([128, 144, 7, 7], f16), T([128, 144, 7, 7], f16)), {})
cnt: 1, ((T([128, 256, 28, 28], f16), T([128, 256, 28, 28], f16)), {})
cnt: 1, ((T([128, 512, 14, 14], f16), T([128, 512, 14, 14], f16)), {})
cnt: 1, ((T([128, 1024, 7, 7], f16), T([128, 1024, 7, 7], f16)), {})
cnt: 4, ((T([128, 256, 56, 56], f16), T([128, 256, 56, 56], f16)), {})
cnt: 1, ((T([128, 64, 56, 56], f16), T([128, 64, 56, 56], f16)), {})
Operator: aten.add_.Tensor
cnt: 325, ((T([], i64), 1), {})
cnt: 4, ((T([128, 256, 56, 56], f16), T([128, 256, 56, 56], f16)), {})
cnt: 32, ((T([128, 18, 56, 56], f16), T([128, 18, 56, 56], f16)), {})
cnt: 32, ((T([128, 36, 28, 28], f16), T([128, 36, 28, 28], f16)), {})
cnt: 28, ((T([128, 72, 14, 14], f16), T([128, 72, 14, 14], f16)), {})
cnt: 12, ((T([128, 144, 7, 7], f16), T([128, 144, 7, 7], f16)), {})
cnt: 1, ((T([128, 128, 56, 56], f16), T([128, 128, 56, 56], f16)), {})
cnt: 1, ((T([128, 256, 28, 28], f16), T([128, 256, 28, 28], f16)), {})
cnt: 1, ((T([128, 512, 14, 14], f16), T([128, 512, 14, 14], f16)), {})
cnt: 1, ((T([128, 1024, 7, 7], f16), T([128, 1024, 7, 7], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([128, 2048], f16), T([2048, 1000], f16, stride=(1, 2048))), {})
Operator: aten.clone.default
cnt: 1, ((T([128, 3, 224, 224], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([128, 3, 224, 224], f16), T([64, 3, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 64, 112, 112], f16), T([64, 64, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 64, 56, 56], f16), T([64, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 64, 56, 56], f16), T([64, 64, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 5, ((T([128, 64, 56, 56], f16), T([256, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 256, 56, 56], f16), T([64, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 56, 56], f16), T([18, 256, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 56, 56], f16), T([36, 256, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 64, ((T([128, 18, 56, 56], f16), T([18, 18, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 64, ((T([128, 36, 28, 28], f16), T([36, 36, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 8, ((T([128, 36, 28, 28], f16), T([18, 36, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 8, ((T([128, 18, 56, 56], f16), T([36, 18, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 8, ((T([128, 36, 28, 28], f16), T([72, 36, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 56, ((T([128, 72, 14, 14], f16), T([72, 72, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 7, ((T([128, 72, 14, 14], f16), T([18, 72, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 7, ((T([128, 72, 14, 14], f16), T([36, 72, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 10, ((T([128, 18, 56, 56], f16), T([18, 18, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 7, ((T([128, 18, 28, 28], f16), T([72, 18, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 72, 14, 14], f16), T([144, 72, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 24, ((T([128, 144, 7, 7], f16), T([144, 144, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 144, 7, 7], f16), T([18, 144, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 144, 7, 7], f16), T([36, 144, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 144, 7, 7], f16), T([72, 144, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 18, 28, 28], f16), T([18, 18, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 18, 14, 14], f16), T([144, 18, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 36, 28, 28], f16), T([36, 36, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 36, 14, 14], f16), T([144, 36, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 18, 56, 56], f16), T([32, 18, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 32, 56, 56], f16), T([32, 32, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 32, 56, 56], f16), T([128, 32, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 18, 56, 56], f16), T([128, 18, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 36, 28, 28], f16), T([64, 36, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 64, 28, 28], f16), T([64, 64, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 64, 28, 28], f16), T([256, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 36, 28, 28], f16), T([256, 36, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 128, 56, 56], f16), T([256, 128, 3, 3], f16), T([256], f16), [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 72, 14, 14], f16), T([128, 72, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 128, 14, 14], f16), T([128, 128, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 128, 14, 14], f16), T([512, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 72, 14, 14], f16), T([512, 72, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 28, 28], f16), T([512, 256, 3, 3], f16), T([512], f16), [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 144, 7, 7], f16), T([256, 144, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 7, 7], f16), T([256, 256, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 7, 7], f16), T([1024, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 144, 7, 7], f16), T([1024, 144, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 512, 14, 14], f16), T([1024, 512, 3, 3], f16), T([1024], f16), [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1024, 7, 7], f16), T([2048, 1024, 1, 1], f16), T([2048], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([128, 2048, 7, 7], f16), T([128, 1024, 7, 7], f16), T([2048, 1024, 1, 1], f16), [2048], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 1024, 7, 7], f16), T([128, 512, 14, 14], f16), T([1024, 512, 3, 3], f16), [1024], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 1024, 7, 7], f16), T([128, 144, 7, 7], f16), T([1024, 144, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 1024, 7, 7], f16), T([128, 256, 7, 7], f16), T([1024, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 256, 7, 7], f16), T([128, 256, 7, 7], f16), T([256, 256, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 256, 7, 7], f16), T([128, 144, 7, 7], f16), T([256, 144, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 512, 14, 14], f16), T([128, 256, 28, 28], f16), T([512, 256, 3, 3], f16), [512], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 512, 14, 14], f16), T([128, 72, 14, 14], f16), T([512, 72, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 512, 14, 14], f16), T([128, 128, 14, 14], f16), T([512, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 128, 14, 14], f16), T([128, 128, 14, 14], f16), T([128, 128, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 128, 14, 14], f16), T([128, 72, 14, 14], f16), T([128, 72, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 256, 28, 28], f16), T([128, 128, 56, 56], f16), T([256, 128, 3, 3], f16), [256], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 256, 28, 28], f16), T([128, 36, 28, 28], f16), T([256, 36, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 256, 28, 28], f16), T([128, 64, 28, 28], f16), T([256, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 64, 28, 28], f16), T([128, 64, 28, 28], f16), T([64, 64, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 64, 28, 28], f16), T([128, 36, 28, 28], f16), T([64, 36, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 128, 56, 56], f16), T([128, 18, 56, 56], f16), T([128, 18, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 128, 56, 56], f16), T([128, 32, 56, 56], f16), T([128, 32, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 32, 56, 56], f16), T([128, 32, 56, 56], f16), T([32, 32, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 32, 56, 56], f16), T([128, 18, 56, 56], f16), T([32, 18, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 144, 7, 7], f16), T([128, 72, 14, 14], f16), T([144, 72, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 144, 7, 7], f16), T([128, 36, 14, 14], f16), T([144, 36, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 36, 14, 14], f16), T([128, 36, 28, 28], f16), T([36, 36, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 144, 7, 7], f16), T([128, 18, 14, 14], f16), T([144, 18, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 18, 14, 14], f16), T([128, 18, 28, 28], f16), T([18, 18, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 10, ((T([128, 18, 28, 28], f16), T([128, 18, 56, 56], f16), T([18, 18, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 72, 7, 7], f16), T([128, 144, 7, 7], f16), T([72, 144, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 8, ((T([128, 72, 14, 14], f16), T([128, 36, 28, 28], f16), T([72, 36, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 7, ((T([128, 72, 14, 14], f16), T([128, 18, 28, 28], f16), T([72, 18, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 36, 7, 7], f16), T([128, 144, 7, 7], f16), T([36, 144, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 7, ((T([128, 36, 14, 14], f16), T([128, 72, 14, 14], f16), T([36, 72, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 8, ((T([128, 36, 28, 28], f16), T([128, 18, 56, 56], f16), T([36, 18, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 18, 7, 7], f16), T([128, 144, 7, 7], f16), T([18, 144, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 7, ((T([128, 18, 14, 14], f16), T([128, 72, 14, 14], f16), T([18, 72, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 8, ((T([128, 18, 28, 28], f16), T([128, 36, 28, 28], f16), T([18, 36, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 24, ((T([128, 144, 7, 7], f16), T([128, 144, 7, 7], f16), T([144, 144, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 56, ((T([128, 72, 14, 14], f16), T([128, 72, 14, 14], f16), T([72, 72, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 64, ((T([128, 36, 28, 28], f16), T([128, 36, 28, 28], f16), T([36, 36, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 64, ((T([128, 18, 56, 56], f16), T([128, 18, 56, 56], f16), T([18, 18, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 36, 28, 28], f16), T([128, 256, 56, 56], f16), T([36, 256, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 18, 56, 56], f16), T([128, 256, 56, 56], f16), T([18, 256, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 5, ((T([128, 256, 56, 56], f16), T([128, 64, 56, 56], f16), T([256, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 64, 56, 56], f16), T([128, 64, 56, 56], f16), T([64, 64, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 64, 56, 56], f16), T([128, 256, 56, 56], f16), T([64, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 64, 56, 56], f16), T([128, 64, 56, 56], f16), T([64, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 64, 56, 56], f16), T([128, 64, 112, 112], f16), T([64, 64, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 64, 112, 112], f16), T([128, 3, 224, 224], f16), T([64, 3, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [False, True, False]), {})
Operator: aten.copy_.default
cnt: 1, ((T([128, 3, 224, 224], f16), T([128, 3, 224, 224], f16)), {})
Operator: aten.div.Scalar
cnt: 1, ((T([128, 2048, 7, 7], f16, stride=(2048, 1, 0, 0)), 49), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([128], i64),), {})
Operator: aten.mean.dim
cnt: 1, ((T([128, 2048, 7, 7], f16), [-1, -2], True), {})
Operator: aten.mm.default
cnt: 1, ((T([128, 1000], f16), T([1000, 2048], f16)), {})
cnt: 1, ((T([1000, 128], f16, stride=(1, 1000)), T([128, 2048], f16)), {})
Operator: aten.native_batch_norm.default
cnt: 1, ((T([128, 64, 112, 112], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 9, ((T([128, 64, 56, 56], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 5, ((T([128, 256, 56, 56], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 65, ((T([128, 18, 56, 56], f16), T([18], f16), T([18], f16), T([18], f16), T([18], f16), True, 0.1, 1e-05), {})
cnt: 73, ((T([128, 36, 28, 28], f16), T([36], f16), T([36], f16), T([36], f16), T([36], f16), True, 0.1, 1e-05), {})
cnt: 18, ((T([128, 18, 28, 28], f16), T([18], f16), T([18], f16), T([18], f16), T([18], f16), True, 0.1, 1e-05), {})
cnt: 71, ((T([128, 72, 14, 14], f16), T([72], f16), T([72], f16), T([72], f16), T([72], f16), True, 0.1, 1e-05), {})
cnt: 10, ((T([128, 18, 14, 14], f16), T([18], f16), T([18], f16), T([18], f16), T([18], f16), True, 0.1, 1e-05), {})
cnt: 10, ((T([128, 36, 14, 14], f16), T([36], f16), T([36], f16), T([36], f16), T([36], f16), True, 0.1, 1e-05), {})
cnt: 34, ((T([128, 144, 7, 7], f16), T([144], f16), T([144], f16), T([144], f16), T([144], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 18, 7, 7], f16), T([18], f16), T([18], f16), T([18], f16), T([18], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 36, 7, 7], f16), T([36], f16), T([36], f16), T([36], f16), T([36], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 72, 7, 7], f16), T([72], f16), T([72], f16), T([72], f16), T([72], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([128, 32, 56, 56], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([128, 128, 56, 56], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([128, 64, 28, 28], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 256, 28, 28], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([128, 128, 14, 14], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 512, 14, 14], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([128, 256, 7, 7], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 1024, 7, 7], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 2048, 7, 7], f16), T([2048], f16), T([2048], f16), T([2048], f16), T([2048], f16), True, 0.1, 1e-05), {})
Operator: aten.native_batch_norm_backward.default
cnt: 1, ((T([128, 2048, 7, 7], f16), T([128, 2048, 7, 7], f16), T([2048], f16), T([2048], f16), T([2048], f16), T([2048], f32), T([2048], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 1024, 7, 7], f16), T([128, 1024, 7, 7], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f32), T([1024], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 256, 7, 7], f16), T([128, 256, 7, 7], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 512, 14, 14], f16), T([128, 512, 14, 14], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 128, 14, 14], f16), T([128, 128, 14, 14], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 256, 28, 28], f16), T([128, 256, 28, 28], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 64, 28, 28], f16), T([128, 64, 28, 28], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 128, 56, 56], f16), T([128, 128, 56, 56], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 32, 56, 56], f16), T([128, 32, 56, 56], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 1e-05, [True, True, True]), {})
cnt: 34, ((T([128, 144, 7, 7], f16), T([128, 144, 7, 7], f16), T([144], f16), T([144], f16), T([144], f16), T([144], f32), T([144], f32), True, 1e-05, [True, True, True]), {})
cnt: 10, ((T([128, 36, 14, 14], f16), T([128, 36, 14, 14], f16), T([36], f16), T([36], f16), T([36], f16), T([36], f32), T([36], f32), True, 1e-05, [True, True, True]), {})
cnt: 10, ((T([128, 18, 14, 14], f16), T([128, 18, 14, 14], f16), T([18], f16), T([18], f16), T([18], f16), T([18], f32), T([18], f32), True, 1e-05, [True, True, True]), {})
cnt: 18, ((T([128, 18, 28, 28], f16), T([128, 18, 28, 28], f16), T([18], f16), T([18], f16), T([18], f16), T([18], f32), T([18], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 72, 7, 7], f16), T([128, 72, 7, 7], f16), T([72], f16), T([72], f16), T([72], f16), T([72], f32), T([72], f32), True, 1e-05, [True, True, True]), {})
cnt: 71, ((T([128, 72, 14, 14], f16), T([128, 72, 14, 14], f16), T([72], f16), T([72], f16), T([72], f16), T([72], f32), T([72], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 36, 7, 7], f16), T([128, 36, 7, 7], f16), T([36], f16), T([36], f16), T([36], f16), T([36], f32), T([36], f32), True, 1e-05, [True, True, True]), {})
cnt: 73, ((T([128, 36, 28, 28], f16), T([128, 36, 28, 28], f16), T([36], f16), T([36], f16), T([36], f16), T([36], f32), T([36], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 18, 7, 7], f16), T([128, 18, 7, 7], f16), T([18], f16), T([18], f16), T([18], f16), T([18], f32), T([18], f32), True, 1e-05, [True, True, True]), {})
cnt: 65, ((T([128, 18, 56, 56], f16), T([128, 18, 56, 56], f16), T([18], f16), T([18], f16), T([18], f16), T([18], f32), T([18], f32), True, 1e-05, [True, True, True]), {})
cnt: 5, ((T([128, 256, 56, 56], f16), T([128, 256, 56, 56], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 9, ((T([128, 64, 56, 56], f16), T([128, 64, 56, 56], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 64, 112, 112], f16), T([128, 64, 112, 112], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([128, 1000], f16), T([128], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([128, 1000], f16), T([128], i64), None, 1, -100), {})
Operator: aten.relu.default
cnt: 8, ((T([128, 18, 56, 56], f16),), {})
cnt: 8, ((T([128, 36, 28, 28], f16),), {})
cnt: 10, ((T([128, 18, 28, 28], f16),), {})
cnt: 7, ((T([128, 72, 14, 14], f16),), {})
cnt: 3, ((T([128, 18, 14, 14], f16),), {})
cnt: 3, ((T([128, 36, 14, 14], f16),), {})
cnt: 3, ((T([128, 144, 7, 7], f16),), {})
Operator: aten.relu_.default
cnt: 1, ((T([128, 64, 112, 112], f16),), {})
cnt: 9, ((T([128, 64, 56, 56], f16),), {})
cnt: 4, ((T([128, 256, 56, 56], f16),), {})
cnt: 65, ((T([128, 18, 56, 56], f16),), {})
cnt: 65, ((T([128, 36, 28, 28], f16),), {})
cnt: 57, ((T([128, 72, 14, 14], f16),), {})
cnt: 25, ((T([128, 144, 7, 7], f16),), {})
cnt: 2, ((T([128, 32, 56, 56], f16),), {})
cnt: 1, ((T([128, 128, 56, 56], f16),), {})
cnt: 2, ((T([128, 64, 28, 28], f16),), {})
cnt: 2, ((T([128, 256, 28, 28], f16),), {})
cnt: 2, ((T([128, 128, 14, 14], f16),), {})
cnt: 2, ((T([128, 512, 14, 14], f16),), {})
cnt: 2, ((T([128, 256, 7, 7], f16),), {})
cnt: 2, ((T([128, 1024, 7, 7], f16),), {})
cnt: 1, ((T([128, 2048, 7, 7], f16),), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([128, 1000], f16), [0], True), {})
Operator: aten.threshold_backward.default
cnt: 1, ((T([128, 2048, 7, 7], f16), T([128, 2048, 7, 7], f16), 0), {})
cnt: 2, ((T([128, 1024, 7, 7], f16), T([128, 1024, 7, 7], f16), 0), {})
cnt: 2, ((T([128, 256, 7, 7], f16), T([128, 256, 7, 7], f16), 0), {})
cnt: 2, ((T([128, 512, 14, 14], f16), T([128, 512, 14, 14], f16), 0), {})
cnt: 2, ((T([128, 128, 14, 14], f16), T([128, 128, 14, 14], f16), 0), {})
cnt: 2, ((T([128, 256, 28, 28], f16), T([128, 256, 28, 28], f16), 0), {})
cnt: 2, ((T([128, 64, 28, 28], f16), T([128, 64, 28, 28], f16), 0), {})
cnt: 1, ((T([128, 128, 56, 56], f16), T([128, 128, 56, 56], f16), 0), {})
cnt: 2, ((T([128, 32, 56, 56], f16), T([128, 32, 56, 56], f16), 0), {})
cnt: 28, ((T([128, 144, 7, 7], f16), T([128, 144, 7, 7], f16), 0), {})
cnt: 3, ((T([128, 36, 14, 14], f16), T([128, 36, 14, 14], f16), 0), {})
cnt: 3, ((T([128, 18, 14, 14], f16), T([128, 18, 14, 14], f16), 0), {})
cnt: 10, ((T([128, 18, 28, 28], f16), T([128, 18, 28, 28], f16), 0), {})
cnt: 64, ((T([128, 72, 14, 14], f16), T([128, 72, 14, 14], f16), 0), {})
cnt: 73, ((T([128, 36, 28, 28], f16), T([128, 36, 28, 28], f16), 0), {})
cnt: 73, ((T([128, 18, 56, 56], f16), T([128, 18, 56, 56], f16), 0), {})
cnt: 4, ((T([128, 256, 56, 56], f16), T([128, 256, 56, 56], f16), 0), {})
cnt: 9, ((T([128, 64, 56, 56], f16), T([128, 64, 56, 56], f16), 0), {})
cnt: 1, ((T([128, 64, 112, 112], f16), T([128, 64, 112, 112], f16), 0), {})
Operator: aten.upsample_nearest2d.vec
cnt: 8, ((T([128, 18, 28, 28], f16), None, [2.0, 2.0]), {})
cnt: 7, ((T([128, 18, 14, 14], f16), None, [4.0, 4.0]), {})
cnt: 7, ((T([128, 36, 14, 14], f16), None, [2.0, 2.0]), {})
cnt: 3, ((T([128, 18, 7, 7], f16), None, [8.0, 8.0]), {})
cnt: 3, ((T([128, 36, 7, 7], f16), None, [4.0, 4.0]), {})
cnt: 3, ((T([128, 72, 7, 7], f16), None, [2.0, 2.0]), {})
Operator: aten.upsample_nearest2d_backward.vec
cnt: 3, ((T([128, 72, 14, 14], f16), None, [128, 72, 7, 7], [2.0, 2.0]), {})
cnt: 3, ((T([128, 36, 28, 28], f16), None, [128, 36, 7, 7], [4.0, 4.0]), {})
cnt: 7, ((T([128, 36, 28, 28], f16), None, [128, 36, 14, 14], [2.0, 2.0]), {})
cnt: 3, ((T([128, 18, 56, 56], f16), None, [128, 18, 7, 7], [8.0, 8.0]), {})
cnt: 7, ((T([128, 18, 56, 56], f16), None, [128, 18, 14, 14], [4.0, 4.0]), {})
cnt: 8, ((T([128, 18, 56, 56], f16), None, [128, 18, 28, 28], [2.0, 2.0]), {})

View File

@ -0,0 +1,239 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([128, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([128, 1000], f16), T([128, 1000], f16), 1, f16), {})
Operator: aten.add.Tensor
cnt: 4, ((T([128, 384, 8, 8], f16), T([128, 384, 8, 8], f16)), {})
cnt: 3, ((T([128, 2048, 8, 8], f16), T([128, 2048, 8, 8], f16)), {})
cnt: 3, ((T([128, 1280, 8, 8], f16), T([128, 1280, 8, 8], f16)), {})
cnt: 14, ((T([128, 768, 17, 17], f16), T([128, 768, 17, 17], f16)), {})
cnt: 5, ((T([128, 288, 35, 35], f16), T([128, 288, 35, 35], f16)), {})
cnt: 3, ((T([128, 256, 35, 35], f16), T([128, 256, 35, 35], f16)), {})
cnt: 3, ((T([128, 192, 35, 35], f16), T([128, 192, 35, 35], f16)), {})
Operator: aten.add_.Tensor
cnt: 94, ((T([], i64), 1), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([128, 2048], f16), T([2048, 1000], f16, stride=(1, 2048))), {})
Operator: aten.avg_pool2d.default
cnt: 1, ((T([128, 192, 35, 35], f16), [3, 3], [1, 1], [1, 1]), {})
cnt: 1, ((T([128, 256, 35, 35], f16), [3, 3], [1, 1], [1, 1]), {})
cnt: 1, ((T([128, 288, 35, 35], f16), [3, 3], [1, 1], [1, 1]), {})
cnt: 4, ((T([128, 768, 17, 17], f16), [3, 3], [1, 1], [1, 1]), {})
cnt: 1, ((T([128, 1280, 8, 8], f16), [3, 3], [1, 1], [1, 1]), {})
cnt: 1, ((T([128, 2048, 8, 8], f16), [3, 3], [1, 1], [1, 1]), {})
Operator: aten.avg_pool2d_backward.default
cnt: 1, ((T([128, 2048, 8, 8], f16), T([128, 2048, 8, 8], f16), [3, 3], [1, 1], [1, 1], False, True, None), {})
cnt: 1, ((T([128, 1280, 8, 8], f16), T([128, 1280, 8, 8], f16), [3, 3], [1, 1], [1, 1], False, True, None), {})
cnt: 4, ((T([128, 768, 17, 17], f16), T([128, 768, 17, 17], f16), [3, 3], [1, 1], [1, 1], False, True, None), {})
cnt: 1, ((T([128, 288, 35, 35], f16), T([128, 288, 35, 35], f16), [3, 3], [1, 1], [1, 1], False, True, None), {})
cnt: 1, ((T([128, 256, 35, 35], f16), T([128, 256, 35, 35], f16), [3, 3], [1, 1], [1, 1], False, True, None), {})
cnt: 1, ((T([128, 192, 35, 35], f16), T([128, 192, 35, 35], f16), [3, 3], [1, 1], [1, 1], False, True, None), {})
Operator: aten.cat.default
cnt: 1, (([T([128, 64, 35, 35], f16), T([128, 64, 35, 35], f16), T([128, 96, 35, 35], f16), T([128, 32, 35, 35], f16)], 1), {})
cnt: 2, (([T([128, 64, 35, 35], f16), T([128, 64, 35, 35], f16), T([128, 96, 35, 35], f16), T([128, 64, 35, 35], f16)], 1), {})
cnt: 1, (([T([128, 384, 17, 17], f16), T([128, 96, 17, 17], f16), T([128, 288, 17, 17], f16)], 1), {})
cnt: 4, (([T([128, 192, 17, 17], f16), T([128, 192, 17, 17], f16), T([128, 192, 17, 17], f16), T([128, 192, 17, 17], f16)], 1), {})
cnt: 1, (([T([128, 320, 8, 8], f16), T([128, 192, 8, 8], f16), T([128, 768, 8, 8], f16)], 1), {})
cnt: 4, (([T([128, 384, 8, 8], f16), T([128, 384, 8, 8], f16)], 1), {})
cnt: 2, (([T([128, 320, 8, 8], f16), T([128, 768, 8, 8], f16), T([128, 768, 8, 8], f16), T([128, 192, 8, 8], f16)], 1), {})
Operator: aten.clone.default
cnt: 1, ((T([128, 3, 299, 299], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([128, 3, 299, 299], f16), T([32, 3, 3, 3], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 32, 149, 149], f16), T([32, 32, 3, 3], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 32, 147, 147], f16), T([64, 32, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 64, 73, 73], f16), T([80, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 80, 73, 73], f16), T([192, 80, 3, 3], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 192, 35, 35], f16), T([64, 192, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 192, 35, 35], f16), T([48, 192, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 48, 35, 35], f16), T([64, 48, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 64, 35, 35], f16), T([96, 64, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 96, 35, 35], f16), T([96, 96, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 192, 35, 35], f16), T([32, 192, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 256, 35, 35], f16), T([64, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 35, 35], f16), T([48, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 288, 35, 35], f16), T([64, 288, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 288, 35, 35], f16), T([48, 288, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 288, 35, 35], f16), T([384, 288, 3, 3], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 96, 35, 35], f16), T([96, 96, 3, 3], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 12, ((T([128, 768, 17, 17], f16), T([192, 768, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 768, 17, 17], f16), T([128, 768, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 128, 17, 17], f16), T([128, 128, 1, 7], f16), None, [1, 1], [0, 3], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 128, 17, 17], f16), T([192, 128, 7, 1], f16), None, [1, 1], [3, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 128, 17, 17], f16), T([128, 128, 7, 1], f16), None, [1, 1], [3, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 128, 17, 17], f16), T([192, 128, 1, 7], f16), None, [1, 1], [0, 3], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 768, 17, 17], f16), T([160, 768, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 160, 17, 17], f16), T([160, 160, 1, 7], f16), None, [1, 1], [0, 3], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 160, 17, 17], f16), T([192, 160, 7, 1], f16), None, [1, 1], [3, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 160, 17, 17], f16), T([160, 160, 7, 1], f16), None, [1, 1], [3, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 160, 17, 17], f16), T([192, 160, 1, 7], f16), None, [1, 1], [0, 3], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 192, 17, 17], f16), T([192, 192, 1, 7], f16), None, [1, 1], [0, 3], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 192, 17, 17], f16), T([192, 192, 7, 1], f16), None, [1, 1], [3, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 192, 17, 17], f16), T([320, 192, 3, 3], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 192, 17, 17], f16), T([192, 192, 3, 3], f16), None, [2, 2], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1280, 8, 8], f16), T([320, 1280, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1280, 8, 8], f16), T([384, 1280, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 384, 8, 8], f16), T([384, 384, 1, 3], f16), None, [1, 1], [0, 1], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 384, 8, 8], f16), T([384, 384, 3, 1], f16), None, [1, 1], [1, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1280, 8, 8], f16), T([448, 1280, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 448, 8, 8], f16), T([384, 448, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1280, 8, 8], f16), T([192, 1280, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 2048, 8, 8], f16), T([320, 2048, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 2048, 8, 8], f16), T([384, 2048, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 2048, 8, 8], f16), T([448, 2048, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 2048, 8, 8], f16), T([192, 2048, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([128, 192, 8, 8], f16), T([128, 2048, 8, 8], f16), T([192, 2048, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 384, 8, 8], f16), T([128, 384, 8, 8], f16), T([384, 384, 3, 1], f16), [0], [1, 1], [1, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 384, 8, 8], f16), T([128, 384, 8, 8], f16), T([384, 384, 1, 3], f16), [0], [1, 1], [0, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 384, 8, 8], f16), T([128, 448, 8, 8], f16), T([384, 448, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 448, 8, 8], f16), T([128, 2048, 8, 8], f16), T([448, 2048, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 384, 8, 8], f16), T([128, 2048, 8, 8], f16), T([384, 2048, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 320, 8, 8], f16), T([128, 2048, 8, 8], f16), T([320, 2048, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 192, 8, 8], f16), T([128, 1280, 8, 8], f16), T([192, 1280, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 448, 8, 8], f16), T([128, 1280, 8, 8], f16), T([448, 1280, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 384, 8, 8], f16), T([128, 1280, 8, 8], f16), T([384, 1280, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 320, 8, 8], f16), T([128, 1280, 8, 8], f16), T([320, 1280, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 192, 8, 8], f16), T([128, 192, 17, 17], f16), T([192, 192, 3, 3], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 192, 17, 17], f16), T([128, 192, 17, 17], f16), T([192, 192, 7, 1], f16), [0], [1, 1], [3, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 192, 17, 17], f16), T([128, 192, 17, 17], f16), T([192, 192, 1, 7], f16), [0], [1, 1], [0, 3], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 12, ((T([128, 192, 17, 17], f16), T([128, 768, 17, 17], f16), T([192, 768, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 320, 8, 8], f16), T([128, 192, 17, 17], f16), T([320, 192, 3, 3], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 192, 17, 17], f16), T([128, 160, 17, 17], f16), T([192, 160, 1, 7], f16), [0], [1, 1], [0, 3], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 160, 17, 17], f16), T([128, 160, 17, 17], f16), T([160, 160, 7, 1], f16), [0], [1, 1], [3, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 160, 17, 17], f16), T([128, 160, 17, 17], f16), T([160, 160, 1, 7], f16), [0], [1, 1], [0, 3], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 160, 17, 17], f16), T([128, 768, 17, 17], f16), T([160, 768, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 192, 17, 17], f16), T([128, 160, 17, 17], f16), T([192, 160, 7, 1], f16), [0], [1, 1], [3, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 192, 17, 17], f16), T([128, 128, 17, 17], f16), T([192, 128, 1, 7], f16), [0], [1, 1], [0, 3], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 128, 17, 17], f16), T([128, 128, 17, 17], f16), T([128, 128, 7, 1], f16), [0], [1, 1], [3, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 128, 17, 17], f16), T([128, 128, 17, 17], f16), T([128, 128, 1, 7], f16), [0], [1, 1], [0, 3], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 128, 17, 17], f16), T([128, 768, 17, 17], f16), T([128, 768, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 192, 17, 17], f16), T([128, 128, 17, 17], f16), T([192, 128, 7, 1], f16), [0], [1, 1], [3, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 96, 17, 17], f16), T([128, 96, 35, 35], f16), T([96, 96, 3, 3], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 96, 35, 35], f16), T([128, 64, 35, 35], f16), T([96, 64, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 4, ((T([128, 64, 35, 35], f16), T([128, 288, 35, 35], f16), T([64, 288, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 384, 17, 17], f16), T([128, 288, 35, 35], f16), T([384, 288, 3, 3], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 96, 35, 35], f16), T([128, 96, 35, 35], f16), T([96, 96, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 64, 35, 35], f16), T([128, 48, 35, 35], f16), T([64, 48, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 48, 35, 35], f16), T([128, 288, 35, 35], f16), T([48, 288, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 64, 35, 35], f16), T([128, 256, 35, 35], f16), T([64, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 48, 35, 35], f16), T([128, 256, 35, 35], f16), T([48, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 32, 35, 35], f16), T([128, 192, 35, 35], f16), T([32, 192, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 64, 35, 35], f16), T([128, 192, 35, 35], f16), T([64, 192, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 48, 35, 35], f16), T([128, 192, 35, 35], f16), T([48, 192, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 192, 71, 71], f16), T([128, 80, 73, 73], f16), T([192, 80, 3, 3], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 80, 73, 73], f16), T([128, 64, 73, 73], f16), T([80, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 64, 147, 147], f16), T([128, 32, 147, 147], f16), T([64, 32, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 32, 147, 147], f16), T([128, 32, 149, 149], f16), T([32, 32, 3, 3], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 32, 149, 149], f16), T([128, 3, 299, 299], f16), T([32, 3, 3, 3], f16), [0], [2, 2], [0, 0], [1, 1], False, [0, 0], 1, [False, True, False]), {})
Operator: aten.copy_.default
cnt: 1, ((T([128, 3, 299, 299], f16), T([128, 3, 299, 299], f16)), {})
Operator: aten.div.Scalar
cnt: 1, ((T([128, 2048, 8, 8], f16, stride=(2048, 1, 0, 0)), 64), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([128], i64),), {})
Operator: aten.max_pool2d_with_indices.default
cnt: 1, ((T([128, 64, 147, 147], f16), [3, 3], [2, 2]), {})
cnt: 1, ((T([128, 192, 71, 71], f16), [3, 3], [2, 2]), {})
cnt: 1, ((T([128, 288, 35, 35], f16), [3, 3], [2, 2]), {})
cnt: 1, ((T([128, 768, 17, 17], f16), [3, 3], [2, 2]), {})
Operator: aten.max_pool2d_with_indices_backward.default
cnt: 1, ((T([128, 768, 8, 8], f16, stride=(81920, 64, 8, 1)), T([128, 768, 17, 17], f16), [3, 3], [2, 2], [0, 0], [1, 1], False, T([128, 768, 8, 8], i64)), {})
cnt: 1, ((T([128, 288, 17, 17], f16, stride=(221952, 289, 17, 1)), T([128, 288, 35, 35], f16), [3, 3], [2, 2], [0, 0], [1, 1], False, T([128, 288, 17, 17], i64)), {})
cnt: 1, ((T([128, 192, 35, 35], f16), T([128, 192, 71, 71], f16), [3, 3], [2, 2], [0, 0], [1, 1], False, T([128, 192, 35, 35], i64)), {})
cnt: 1, ((T([128, 64, 73, 73], f16), T([128, 64, 147, 147], f16), [3, 3], [2, 2], [0, 0], [1, 1], False, T([128, 64, 73, 73], i64)), {})
Operator: aten.mean.dim
cnt: 1, ((T([128, 2048, 8, 8], f16), [-1, -2], True), {})
Operator: aten.mm.default
cnt: 1, ((T([128, 1000], f16), T([1000, 2048], f16)), {})
cnt: 1, ((T([1000, 128], f16, stride=(1, 1000)), T([128, 2048], f16)), {})
Operator: aten.native_batch_norm.default
cnt: 1, ((T([128, 32, 149, 149], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([128, 32, 147, 147], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([128, 64, 147, 147], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([128, 80, 73, 73], f16), T([80], f16), T([80], f16), T([80], f16), T([80], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([128, 192, 71, 71], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f16), True, 0.1, 0.001), {})
cnt: 12, ((T([128, 64, 35, 35], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 0.001), {})
cnt: 3, ((T([128, 48, 35, 35], f16), T([48], f16), T([48], f16), T([48], f16), T([48], f16), True, 0.1, 0.001), {})
cnt: 7, ((T([128, 96, 35, 35], f16), T([96], f16), T([96], f16), T([96], f16), T([96], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([128, 32, 35, 35], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([128, 384, 17, 17], f16), T([384], f16), T([384], f16), T([384], f16), T([384], f16), True, 0.1, 0.001), {})
cnt: 1, ((T([128, 96, 17, 17], f16), T([96], f16), T([96], f16), T([96], f16), T([96], f16), True, 0.1, 0.001), {})
cnt: 26, ((T([128, 192, 17, 17], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f16), True, 0.1, 0.001), {})
cnt: 6, ((T([128, 128, 17, 17], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 0.001), {})
cnt: 12, ((T([128, 160, 17, 17], f16), T([160], f16), T([160], f16), T([160], f16), T([160], f16), True, 0.1, 0.001), {})
cnt: 3, ((T([128, 320, 8, 8], f16), T([320], f16), T([320], f16), T([320], f16), T([320], f16), True, 0.1, 0.001), {})
cnt: 3, ((T([128, 192, 8, 8], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f16), True, 0.1, 0.001), {})
cnt: 12, ((T([128, 384, 8, 8], f16), T([384], f16), T([384], f16), T([384], f16), T([384], f16), True, 0.1, 0.001), {})
cnt: 2, ((T([128, 448, 8, 8], f16), T([448], f16), T([448], f16), T([448], f16), T([448], f16), True, 0.1, 0.001), {})
Operator: aten.native_batch_norm_backward.default
cnt: 3, ((T([128, 192, 8, 8], f16), T([128, 192, 8, 8], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f32), T([192], f32), True, 0.001, [True, True, True]), {})
cnt: 12, ((T([128, 384, 8, 8], f16), T([128, 384, 8, 8], f16), T([384], f16), T([384], f16), T([384], f16), T([384], f32), T([384], f32), True, 0.001, [True, True, True]), {})
cnt: 2, ((T([128, 448, 8, 8], f16), T([128, 448, 8, 8], f16), T([448], f16), T([448], f16), T([448], f16), T([448], f32), T([448], f32), True, 0.001, [True, True, True]), {})
cnt: 3, ((T([128, 320, 8, 8], f16), T([128, 320, 8, 8], f16), T([320], f16), T([320], f16), T([320], f16), T([320], f32), T([320], f32), True, 0.001, [True, True, True]), {})
cnt: 26, ((T([128, 192, 17, 17], f16), T([128, 192, 17, 17], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f32), T([192], f32), True, 0.001, [True, True, True]), {})
cnt: 12, ((T([128, 160, 17, 17], f16), T([128, 160, 17, 17], f16), T([160], f16), T([160], f16), T([160], f16), T([160], f32), T([160], f32), True, 0.001, [True, True, True]), {})
cnt: 6, ((T([128, 128, 17, 17], f16), T([128, 128, 17, 17], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([128, 96, 17, 17], f16), T([128, 96, 17, 17], f16), T([96], f16), T([96], f16), T([96], f16), T([96], f32), T([96], f32), True, 0.001, [True, True, True]), {})
cnt: 7, ((T([128, 96, 35, 35], f16), T([128, 96, 35, 35], f16), T([96], f16), T([96], f16), T([96], f16), T([96], f32), T([96], f32), True, 0.001, [True, True, True]), {})
cnt: 12, ((T([128, 64, 35, 35], f16), T([128, 64, 35, 35], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([128, 384, 17, 17], f16), T([128, 384, 17, 17], f16), T([384], f16), T([384], f16), T([384], f16), T([384], f32), T([384], f32), True, 0.001, [True, True, True]), {})
cnt: 3, ((T([128, 48, 35, 35], f16), T([128, 48, 35, 35], f16), T([48], f16), T([48], f16), T([48], f16), T([48], f32), T([48], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([128, 32, 35, 35], f16), T([128, 32, 35, 35], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([128, 192, 71, 71], f16), T([128, 192, 71, 71], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f32), T([192], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([128, 80, 73, 73], f16), T([128, 80, 73, 73], f16), T([80], f16), T([80], f16), T([80], f16), T([80], f32), T([80], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([128, 64, 147, 147], f16), T([128, 64, 147, 147], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([128, 32, 147, 147], f16), T([128, 32, 147, 147], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 0.001, [True, True, True]), {})
cnt: 1, ((T([128, 32, 149, 149], f16), T([128, 32, 149, 149], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 0.001, [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([128, 1000], f16), T([128], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([128, 1000], f16), T([128], i64), None, 1, -100), {})
Operator: aten.relu_.default
cnt: 1, ((T([128, 32, 149, 149], f16),), {})
cnt: 1, ((T([128, 32, 147, 147], f16),), {})
cnt: 1, ((T([128, 64, 147, 147], f16),), {})
cnt: 1, ((T([128, 80, 73, 73], f16),), {})
cnt: 1, ((T([128, 192, 71, 71], f16),), {})
cnt: 12, ((T([128, 64, 35, 35], f16),), {})
cnt: 3, ((T([128, 48, 35, 35], f16),), {})
cnt: 7, ((T([128, 96, 35, 35], f16),), {})
cnt: 1, ((T([128, 32, 35, 35], f16),), {})
cnt: 1, ((T([128, 384, 17, 17], f16),), {})
cnt: 1, ((T([128, 96, 17, 17], f16),), {})
cnt: 26, ((T([128, 192, 17, 17], f16),), {})
cnt: 6, ((T([128, 128, 17, 17], f16),), {})
cnt: 12, ((T([128, 160, 17, 17], f16),), {})
cnt: 3, ((T([128, 320, 8, 8], f16),), {})
cnt: 3, ((T([128, 192, 8, 8], f16),), {})
cnt: 12, ((T([128, 384, 8, 8], f16),), {})
cnt: 2, ((T([128, 448, 8, 8], f16),), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([128, 1000], f16), [0], True), {})
Operator: aten.threshold_backward.default
cnt: 2, ((T([128, 192, 8, 8], f16, stride=(131072, 64, 8, 1)), T([128, 192, 8, 8], f16), 0), {})
cnt: 8, ((T([128, 384, 8, 8], f16, stride=(131072, 64, 8, 1)), T([128, 384, 8, 8], f16), 0), {})
cnt: 4, ((T([128, 384, 8, 8], f16), T([128, 384, 8, 8], f16), 0), {})
cnt: 2, ((T([128, 448, 8, 8], f16), T([128, 448, 8, 8], f16), 0), {})
cnt: 2, ((T([128, 320, 8, 8], f16, stride=(131072, 64, 8, 1)), T([128, 320, 8, 8], f16), 0), {})
cnt: 1, ((T([128, 192, 8, 8], f16, stride=(81920, 64, 8, 1)), T([128, 192, 8, 8], f16), 0), {})
cnt: 10, ((T([128, 192, 17, 17], f16), T([128, 192, 17, 17], f16), 0), {})
cnt: 1, ((T([128, 320, 8, 8], f16, stride=(81920, 64, 8, 1)), T([128, 320, 8, 8], f16), 0), {})
cnt: 16, ((T([128, 192, 17, 17], f16, stride=(221952, 289, 17, 1)), T([128, 192, 17, 17], f16), 0), {})
cnt: 12, ((T([128, 160, 17, 17], f16), T([128, 160, 17, 17], f16), 0), {})
cnt: 6, ((T([128, 128, 17, 17], f16), T([128, 128, 17, 17], f16), 0), {})
cnt: 1, ((T([128, 96, 17, 17], f16, stride=(221952, 289, 17, 1)), T([128, 96, 17, 17], f16), 0), {})
cnt: 4, ((T([128, 96, 35, 35], f16), T([128, 96, 35, 35], f16), 0), {})
cnt: 4, ((T([128, 64, 35, 35], f16), T([128, 64, 35, 35], f16), 0), {})
cnt: 1, ((T([128, 384, 17, 17], f16, stride=(221952, 289, 17, 1)), T([128, 384, 17, 17], f16), 0), {})
cnt: 6, ((T([128, 64, 35, 35], f16, stride=(352800, 1225, 35, 1)), T([128, 64, 35, 35], f16), 0), {})
cnt: 2, ((T([128, 96, 35, 35], f16, stride=(352800, 1225, 35, 1)), T([128, 96, 35, 35], f16), 0), {})
cnt: 3, ((T([128, 48, 35, 35], f16), T([128, 48, 35, 35], f16), 0), {})
cnt: 1, ((T([128, 32, 35, 35], f16, stride=(313600, 1225, 35, 1)), T([128, 32, 35, 35], f16), 0), {})
cnt: 1, ((T([128, 96, 35, 35], f16, stride=(313600, 1225, 35, 1)), T([128, 96, 35, 35], f16), 0), {})
cnt: 2, ((T([128, 64, 35, 35], f16, stride=(313600, 1225, 35, 1)), T([128, 64, 35, 35], f16), 0), {})
cnt: 1, ((T([128, 192, 71, 71], f16), T([128, 192, 71, 71], f16), 0), {})
cnt: 1, ((T([128, 80, 73, 73], f16), T([128, 80, 73, 73], f16), 0), {})
cnt: 1, ((T([128, 64, 147, 147], f16), T([128, 64, 147, 147], f16), 0), {})
cnt: 1, ((T([128, 32, 147, 147], f16), T([128, 32, 147, 147], f16), 0), {})
cnt: 1, ((T([128, 32, 149, 149], f16), T([128, 32, 149, 149], f16), 0), {})

View File

@ -0,0 +1,269 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([64, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([64, 1000], f16), T([64, 1000], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 2, ((T([64, 4, 16, 196, 196], f16), -1, False), {})
cnt: 2, ((T([64, 8, 4, 196, 196], f16), -1, False), {})
cnt: 20, ((T([64, 16, 1, 196, 196], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 20, ((T([64, 16, 1, 196, 196], f16), T([64, 16, 1, 196, 196], f16), -1, f16), {})
cnt: 2, ((T([64, 8, 4, 196, 196], f16), T([64, 8, 4, 196, 196], f16), -1, f16), {})
cnt: 2, ((T([64, 4, 16, 196, 196], f16), T([64, 4, 16, 196, 196], f16), -1, f16), {})
Operator: aten._unsafe_view.default
cnt: 2, ((T([64, 4, 4, 14, 14, 128], f16), [64, 16, 196, 128]), {})
cnt: 2, ((T([200704, 384], f16), [64, 16, 196, 384]), {})
cnt: 6, ((T([64, 4, 16, 196, 32], f16), [4096, 196, 32]), {})
cnt: 2, ((T([64, 4, 16, 32, 196], f16), [4096, 32, 196]), {})
cnt: 2, ((T([4096, 196, 196], f16), [64, 4, 16, 196, 196]), {})
cnt: 2, ((T([4096, 196, 32], f16), [64, 4, 16, 196, 32]), {})
cnt: 2, ((T([64, 16, 196, 32, 4], f16), [64, 16, 196, 128]), {})
cnt: 4, ((T([200704, 128], f16), [64, 16, 196, 128]), {})
cnt: 2, ((T([200704, 512], f16), [64, 16, 196, 512]), {})
cnt: 2, ((T([64, 4, 14, 4, 14, 128], f16), [64, 56, 56, 128]), {})
cnt: 2, ((T([64, 2, 2, 14, 14, 256], f16), [64, 4, 196, 256]), {})
cnt: 2, ((T([50176, 768], f16), [64, 4, 196, 768]), {})
cnt: 6, ((T([64, 8, 4, 196, 32], f16), [2048, 196, 32]), {})
cnt: 2, ((T([64, 8, 4, 32, 196], f16), [2048, 32, 196]), {})
cnt: 2, ((T([2048, 196, 196], f16), [64, 8, 4, 196, 196]), {})
cnt: 2, ((T([2048, 196, 32], f16), [64, 8, 4, 196, 32]), {})
cnt: 2, ((T([64, 4, 196, 32, 8], f16), [64, 4, 196, 256]), {})
cnt: 4, ((T([50176, 256], f16), [64, 4, 196, 256]), {})
cnt: 2, ((T([50176, 1024], f16), [64, 4, 196, 1024]), {})
cnt: 2, ((T([64, 2, 14, 2, 14, 256], f16), [64, 28, 28, 256]), {})
cnt: 20, ((T([12544, 1536], f16), [64, 1, 196, 1536]), {})
cnt: 60, ((T([64, 16, 1, 196, 32], f16), [1024, 196, 32]), {})
cnt: 20, ((T([64, 16, 1, 32, 196], f16), [1024, 32, 196]), {})
cnt: 20, ((T([1024, 196, 196], f16), [64, 16, 1, 196, 196]), {})
cnt: 20, ((T([1024, 196, 32], f16), [64, 16, 1, 196, 32]), {})
cnt: 20, ((T([64, 1, 196, 32, 16], f16), [64, 1, 196, 512]), {})
cnt: 40, ((T([12544, 512], f16), [64, 1, 196, 512]), {})
cnt: 20, ((T([12544, 2048], f16), [64, 1, 196, 2048]), {})
cnt: 40, ((T([64, 1, 196, 512], f16), [12544, 512]), {})
cnt: 20, ((T([64, 1, 196, 3, 16, 32], f16), [64, 1, 196, 1536]), {})
cnt: 2, ((T([64, 4, 196, 3, 8, 32], f16), [64, 4, 196, 768]), {})
cnt: 2, ((T([64, 16, 196, 3, 4, 32], f16), [64, 16, 196, 384]), {})
Operator: aten.add.Tensor
cnt: 1, ((T([64, 16, 196, 128], f16), T([1, 16, 196, 128], f16)), {})
cnt: 2, ((T([64, 16, 196, 384], f16), T([384], f16)), {})
cnt: 4, ((T([64, 16, 196, 128], f16), T([128], f16)), {})
cnt: 8, ((T([64, 16, 196, 128], f16), T([64, 16, 196, 128], f16)), {})
cnt: 2, ((T([64, 16, 196, 512], f16), T([512], f16)), {})
cnt: 1, ((T([64, 4, 196, 256], f16), T([1, 4, 196, 256], f16)), {})
cnt: 2, ((T([64, 4, 196, 768], f16), T([768], f16)), {})
cnt: 4, ((T([64, 4, 196, 256], f16), T([256], f16)), {})
cnt: 8, ((T([64, 4, 196, 256], f16), T([64, 4, 196, 256], f16)), {})
cnt: 2, ((T([64, 4, 196, 1024], f16), T([1024], f16)), {})
cnt: 1, ((T([64, 1, 196, 512], f16), T([1, 1, 196, 512], f16)), {})
cnt: 20, ((T([64, 1, 196, 1536], f16), T([1536], f16)), {})
cnt: 40, ((T([64, 1, 196, 512], f16), T([512], f16)), {})
cnt: 40, ((T([64, 1, 196, 512], f16), T([64, 1, 196, 512], f16)), {})
cnt: 20, ((T([64, 1, 196, 2048], f16), T([2048], f16)), {})
cnt: 40, ((T([64, 1, 196, 512], f16, stride=(100352, 196, 1, 196)), T([64, 1, 196, 512], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([64, 512], f16), T([512, 1000], f16, stride=(1, 512))), {})
Operator: aten.as_strided_.default
cnt: 1, ((T([64, 512, 1, 1], f16), [64, 512, 1, 1], [512, 1, 512, 512]), {})
Operator: aten.bernoulli_.float
cnt: 2, ((T([64, 1, 1, 1], f16), 0.9782608691602945), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.9565217383205891), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.9347826093435287), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.9130434766411781), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.8913043439388275), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.8695652186870575), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.8478260785341263), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.8260869532823563), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.8043478280305862), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.782608687877655), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.760869562625885), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.739130437374115), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.717391312122345), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.695652186870575), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.6739130318164825), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.6521739065647125), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.6304347813129425), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.6086956560611725), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.5869565308094025), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.5652174055576324), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.54347825050354), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.52173912525177), {})
cnt: 2, ((T([64, 1, 1, 1], f16),), {})
Operator: aten.bmm.default
cnt: 2, ((T([4096, 196, 32], f16), T([4096, 32, 196], f16)), {})
cnt: 2, ((T([4096, 196, 196], f16), T([4096, 196, 32], f16)), {})
cnt: 2, ((T([2048, 196, 32], f16), T([2048, 32, 196], f16)), {})
cnt: 2, ((T([2048, 196, 196], f16), T([2048, 196, 32], f16)), {})
cnt: 20, ((T([1024, 196, 32], f16), T([1024, 32, 196], f16)), {})
cnt: 20, ((T([1024, 196, 196], f16), T([1024, 196, 32], f16)), {})
cnt: 20, ((T([1024, 196, 196], f16, stride=(38416, 1, 196)), T([1024, 196, 32], f16)), {})
cnt: 20, ((T([1024, 196, 32], f16), T([1024, 32, 196], f16, stride=(6272, 1, 32))), {})
cnt: 20, ((T([1024, 32, 196], f16, stride=(6272, 1, 32)), T([1024, 196, 196], f16)), {})
cnt: 20, ((T([1024, 196, 196], f16), T([1024, 196, 32], f16, stride=(6272, 1, 196))), {})
cnt: 2, ((T([2048, 196, 196], f16, stride=(38416, 1, 196)), T([2048, 196, 32], f16)), {})
cnt: 2, ((T([2048, 196, 32], f16), T([2048, 32, 196], f16, stride=(6272, 1, 32))), {})
cnt: 2, ((T([2048, 32, 196], f16, stride=(6272, 1, 32)), T([2048, 196, 196], f16)), {})
cnt: 2, ((T([2048, 196, 196], f16), T([2048, 196, 32], f16, stride=(6272, 1, 196))), {})
cnt: 2, ((T([4096, 196, 196], f16, stride=(38416, 1, 196)), T([4096, 196, 32], f16)), {})
cnt: 2, ((T([4096, 196, 32], f16), T([4096, 32, 196], f16, stride=(6272, 1, 32))), {})
cnt: 2, ((T([4096, 32, 196], f16, stride=(6272, 1, 32)), T([4096, 196, 196], f16)), {})
cnt: 2, ((T([4096, 196, 196], f16), T([4096, 196, 32], f16, stride=(6272, 1, 196))), {})
Operator: aten.clone.default
cnt: 1, ((T([64, 3, 224, 224], f16),), {})
Operator: aten.constant_pad_nd.default
cnt: 1, ((T([64, 256, 56, 56], f16, stride=(802816, 1, 14336, 256)), [0, 1, 0, 1], -inf), {})
cnt: 1, ((T([64, 512, 28, 28], f16, stride=(401408, 1, 14336, 512)), [0, 1, 0, 1], -inf), {})
cnt: 1, ((T([64, 512, 29, 29], f16, stride=(430592, 1, 14848, 512)), [0, -1, 0, -1]), {})
cnt: 1, ((T([64, 256, 57, 57], f16, stride=(831744, 1, 14592, 256)), [0, -1, 0, -1]), {})
Operator: aten.convolution.default
cnt: 1, ((T([64, 3, 224, 224], f16), T([128, 3, 4, 4], f16), T([128], f16), [4, 4], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 128, 56, 56], f16, stride=(401408, 1, 7168, 128)), T([256, 128, 3, 3], f16), T([256], f16), [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 256, 28, 28], f16, stride=(200704, 1, 7168, 256)), T([512, 256, 3, 3], f16), T([512], f16), [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([64, 512, 28, 28], f16, stride=(401408, 1, 14336, 512)), T([64, 256, 28, 28], f16, stride=(200704, 1, 7168, 256)), T([512, 256, 3, 3], f16), [512], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([64, 256, 56, 56], f16, stride=(802816, 1, 14336, 256)), T([64, 128, 56, 56], f16, stride=(401408, 1, 7168, 128)), T([256, 128, 3, 3], f16), [256], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([64, 128, 56, 56], f16, stride=(401408, 1, 7168, 128)), T([64, 3, 224, 224], f16), T([128, 3, 4, 4], f16), [128], [4, 4], [0, 0], [1, 1], False, [0, 0], 1, [False, True, True]), {})
Operator: aten.copy_.default
cnt: 1, ((T([64, 3, 224, 224], f16), T([64, 3, 224, 224], f16)), {})
cnt: 1, ((T([64, 512], f16), T([64, 512], f16)), {})
cnt: 1, ((T([512, 256, 3, 3], f16), T([512, 256, 3, 3], f16, stride=(2304, 1, 768, 256))), {})
cnt: 1, ((T([256, 128, 3, 3], f16), T([256, 128, 3, 3], f16, stride=(1152, 1, 384, 128))), {})
Operator: aten.div.Scalar
cnt: 1, ((T([64, 512, 14, 14], f16, stride=(512, 1, 0, 0)), 196), {})
Operator: aten.div_.Tensor
cnt: 2, ((T([64, 1, 1, 1], f16), 0.9782608691602945), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.9565217383205891), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.9347826093435287), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.9130434766411781), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.8913043439388275), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.8695652186870575), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.8478260785341263), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.8260869532823563), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.8043478280305862), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.782608687877655), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.760869562625885), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.739130437374115), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.717391312122345), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.695652186870575), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.6739130318164825), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.6521739065647125), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.6304347813129425), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.6086956560611725), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.5869565308094025), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.5652174055576324), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.54347825050354), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.52173912525177), {})
cnt: 2, ((T([64, 1, 1, 1], f16), 0.5), {})
Operator: aten.gelu.default
cnt: 2, ((T([64, 16, 196, 512], f16),), {})
cnt: 2, ((T([64, 4, 196, 1024], f16),), {})
cnt: 20, ((T([64, 1, 196, 2048], f16),), {})
Operator: aten.gelu_backward.default
cnt: 20, ((T([64, 1, 196, 2048], f16), T([64, 1, 196, 2048], f16)), {})
cnt: 2, ((T([64, 4, 196, 1024], f16), T([64, 4, 196, 1024], f16)), {})
cnt: 2, ((T([64, 16, 196, 512], f16), T([64, 16, 196, 512], f16)), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([64], i64),), {})
Operator: aten.max_pool2d_with_indices.default
cnt: 1, ((T([64, 256, 57, 57], f16, stride=(831744, 1, 14592, 256)), [3, 3], [2, 2]), {})
cnt: 1, ((T([64, 512, 29, 29], f16, stride=(430592, 1, 14848, 512)), [3, 3], [2, 2]), {})
Operator: aten.max_pool2d_with_indices_backward.default
cnt: 1, ((T([64, 512, 14, 14], f16), T([64, 512, 29, 29], f16, stride=(430592, 1, 14848, 512)), [3, 3], [2, 2], [0, 0], [1, 1], False, T([64, 512, 14, 14], i64, stride=(100352, 1, 7168, 512))), {})
cnt: 1, ((T([64, 256, 28, 28], f16, stride=(200704, 1, 7168, 256)), T([64, 256, 57, 57], f16, stride=(831744, 1, 14592, 256)), [3, 3], [2, 2], [0, 0], [1, 1], False, T([64, 256, 28, 28], i64, stride=(200704, 1, 7168, 256))), {})
Operator: aten.mean.dim
cnt: 1, ((T([64, 512, 14, 14], f16, stride=(100352, 1, 7168, 512)), [-1, -2], True), {})
Operator: aten.mm.default
cnt: 2, ((T([200704, 128], f16), T([128, 384], f16, stride=(1, 128))), {})
cnt: 2, ((T([200704, 128], f16), T([128, 128], f16, stride=(1, 128))), {})
cnt: 2, ((T([200704, 128], f16), T([128, 512], f16, stride=(1, 128))), {})
cnt: 2, ((T([200704, 512], f16), T([512, 128], f16, stride=(1, 512))), {})
cnt: 2, ((T([50176, 256], f16), T([256, 768], f16, stride=(1, 256))), {})
cnt: 2, ((T([50176, 256], f16), T([256, 256], f16, stride=(1, 256))), {})
cnt: 2, ((T([50176, 256], f16), T([256, 1024], f16, stride=(1, 256))), {})
cnt: 2, ((T([50176, 1024], f16), T([1024, 256], f16, stride=(1, 1024))), {})
cnt: 20, ((T([12544, 512], f16), T([512, 1536], f16, stride=(1, 512))), {})
cnt: 20, ((T([12544, 512], f16), T([512, 512], f16, stride=(1, 512))), {})
cnt: 20, ((T([12544, 512], f16), T([512, 2048], f16, stride=(1, 512))), {})
cnt: 20, ((T([12544, 2048], f16), T([2048, 512], f16, stride=(1, 2048))), {})
cnt: 1, ((T([64, 1000], f16), T([1000, 512], f16)), {})
cnt: 1, ((T([1000, 64], f16, stride=(1, 1000)), T([64, 512], f16)), {})
cnt: 20, ((T([512, 12544], f16, stride=(1, 512)), T([12544, 2048], f16)), {})
cnt: 20, ((T([12544, 512], f16), T([512, 2048], f16)), {})
cnt: 20, ((T([2048, 12544], f16, stride=(1, 2048)), T([12544, 512], f16)), {})
cnt: 20, ((T([12544, 2048], f16), T([2048, 512], f16)), {})
cnt: 20, ((T([512, 12544], f16, stride=(1, 512)), T([12544, 512], f16)), {})
cnt: 20, ((T([12544, 512], f16), T([512, 512], f16)), {})
cnt: 20, ((T([1536, 12544], f16, stride=(1, 1536)), T([12544, 512], f16)), {})
cnt: 20, ((T([12544, 1536], f16), T([1536, 512], f16)), {})
cnt: 2, ((T([256, 50176], f16, stride=(1, 256)), T([50176, 1024], f16)), {})
cnt: 2, ((T([50176, 256], f16), T([256, 1024], f16)), {})
cnt: 2, ((T([1024, 50176], f16, stride=(1, 1024)), T([50176, 256], f16)), {})
cnt: 2, ((T([50176, 1024], f16), T([1024, 256], f16)), {})
cnt: 2, ((T([256, 50176], f16, stride=(1, 256)), T([50176, 256], f16)), {})
cnt: 2, ((T([50176, 256], f16), T([256, 256], f16)), {})
cnt: 2, ((T([768, 50176], f16, stride=(1, 768)), T([50176, 256], f16)), {})
cnt: 2, ((T([50176, 768], f16), T([768, 256], f16)), {})
cnt: 2, ((T([128, 200704], f16, stride=(1, 128)), T([200704, 512], f16)), {})
cnt: 2, ((T([200704, 128], f16), T([128, 512], f16)), {})
cnt: 2, ((T([512, 200704], f16, stride=(1, 512)), T([200704, 128], f16)), {})
cnt: 2, ((T([200704, 512], f16), T([512, 128], f16)), {})
cnt: 2, ((T([128, 200704], f16, stride=(1, 128)), T([200704, 128], f16)), {})
cnt: 2, ((T([200704, 128], f16), T([128, 128], f16)), {})
cnt: 2, ((T([384, 200704], f16, stride=(1, 384)), T([200704, 128], f16)), {})
cnt: 2, ((T([200704, 384], f16), T([384, 128], f16)), {})
Operator: aten.mul.Tensor
cnt: 4, ((T([64, 4, 16, 196, 196], f16), 0.1767766952966369), {})
cnt: 4, ((T([64, 16, 196, 128], f16), T([64, 1, 1, 1], f16)), {})
cnt: 4, ((T([64, 8, 4, 196, 196], f16), 0.1767766952966369), {})
cnt: 8, ((T([64, 4, 196, 256], f16), T([64, 1, 1, 1], f16)), {})
cnt: 40, ((T([64, 16, 1, 196, 196], f16), 0.1767766952966369), {})
cnt: 40, ((T([64, 1, 196, 512], f16), T([64, 1, 1, 1], f16)), {})
cnt: 40, ((T([64, 1, 196, 512], f16, stride=(100352, 196, 1, 196)), T([64, 1, 1, 1], f16)), {})
Operator: aten.native_layer_norm.default
cnt: 4, ((T([64, 16, 196, 128], f16), [128], T([128], f16), T([128], f16), 1e-06), {})
cnt: 1, ((T([64, 56, 56, 256], f16), [256], T([256], f16), T([256], f16), 1e-06), {})
cnt: 4, ((T([64, 4, 196, 256], f16), [256], T([256], f16), T([256], f16), 1e-06), {})
cnt: 1, ((T([64, 28, 28, 512], f16), [512], T([512], f16), T([512], f16), 1e-06), {})
cnt: 40, ((T([64, 1, 196, 512], f16), [512], T([512], f16), T([512], f16), 1e-06), {})
cnt: 1, ((T([64, 14, 14, 512], f16), [512], T([512], f16), T([512], f16), 1e-06), {})
Operator: aten.native_layer_norm_backward.default
cnt: 1, ((T([64, 14, 14, 512], f16, stride=(100352, 14, 1, 196)), T([64, 14, 14, 512], f16), [512], T([64, 14, 14, 1], f32), T([64, 14, 14, 1], f32), T([512], f16), T([512], f16), [True, True, True]), {})
cnt: 40, ((T([64, 1, 196, 512], f16), T([64, 1, 196, 512], f16), [512], T([64, 1, 196, 1], f32), T([64, 1, 196, 1], f32), T([512], f16), T([512], f16), [True, True, True]), {})
cnt: 1, ((T([64, 28, 28, 512], f16), T([64, 28, 28, 512], f16), [512], T([64, 28, 28, 1], f32), T([64, 28, 28, 1], f32), T([512], f16), T([512], f16), [True, True, True]), {})
cnt: 4, ((T([64, 4, 196, 256], f16), T([64, 4, 196, 256], f16), [256], T([64, 4, 196, 1], f32), T([64, 4, 196, 1], f32), T([256], f16), T([256], f16), [True, True, True]), {})
cnt: 1, ((T([64, 56, 56, 256], f16), T([64, 56, 56, 256], f16), [256], T([64, 56, 56, 1], f32), T([64, 56, 56, 1], f32), T([256], f16), T([256], f16), [True, True, True]), {})
cnt: 4, ((T([64, 16, 196, 128], f16), T([64, 16, 196, 128], f16), [128], T([64, 16, 196, 1], f32), T([64, 16, 196, 1], f32), T([128], f16), T([128], f16), [True, True, True]), {})
Operator: aten.new_empty.default
cnt: 2, ((T([64, 16, 196, 128], f16), [64, 1, 1, 1]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda', 'pin_memory': False})
cnt: 4, ((T([64, 4, 196, 256], f16), [64, 1, 1, 1]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda', 'pin_memory': False})
cnt: 40, ((T([64, 1, 196, 512], f16), [64, 1, 1, 1]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda', 'pin_memory': False})
Operator: aten.new_empty_strided.default
cnt: 1, ((T([512, 256, 3, 3], f16, stride=(2304, 1, 768, 256)), [512, 256, 3, 3], [2304, 9, 3, 1]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
cnt: 1, ((T([256, 128, 3, 3], f16, stride=(1152, 1, 384, 128)), [256, 128, 3, 3], [1152, 9, 3, 1]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten.new_zeros.default
cnt: 1, ((T([64, 512], f16), [32768]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([64, 1000], f16), T([64], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([64, 1000], f16), T([64], i64), None, 1, -100), {})
Operator: aten.stack.default
cnt: 20, (([T([64, 16, 1, 196, 32], f16), T([64, 16, 1, 196, 32], f16, stride=(100352, 6272, 6272, 1, 196)), T([64, 16, 1, 196, 32], f16)],), {})
cnt: 2, (([T([64, 8, 4, 196, 32], f16), T([64, 8, 4, 196, 32], f16, stride=(200704, 25088, 6272, 1, 196)), T([64, 8, 4, 196, 32], f16)],), {})
cnt: 2, (([T([64, 4, 16, 196, 32], f16), T([64, 4, 16, 196, 32], f16, stride=(401408, 100352, 6272, 1, 196)), T([64, 4, 16, 196, 32], f16)],), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([64, 1000], f16), [0], True), {})
cnt: 40, ((T([64, 1, 196, 512], f16, stride=(100352, 196, 1, 196)), [0, 1, 2], True), {})
cnt: 20, ((T([64, 1, 196, 2048], f16), [0, 1, 2], True), {})
cnt: 20, ((T([64, 1, 196, 1536], f16), [0, 1, 2], True), {})
cnt: 1, ((T([64, 1, 196, 512], f16, stride=(100352, 196, 1, 196)), [0], True), {})
cnt: 4, ((T([64, 4, 196, 256], f16), [0, 1, 2], True), {})
cnt: 2, ((T([64, 4, 196, 1024], f16), [0, 1, 2], True), {})
cnt: 2, ((T([64, 4, 196, 768], f16), [0, 1, 2], True), {})
cnt: 1, ((T([64, 4, 196, 256], f16), [0], True), {})
cnt: 4, ((T([64, 16, 196, 128], f16), [0, 1, 2], True), {})
cnt: 2, ((T([64, 16, 196, 512], f16), [0, 1, 2], True), {})
cnt: 2, ((T([64, 16, 196, 384], f16), [0, 1, 2], True), {})
cnt: 1, ((T([64, 16, 196, 128], f16), [0], True), {})
Operator: aten.unbind.int
cnt: 2, ((T([3, 64, 4, 16, 196, 32], f16, stride=(128, 1204224, 32, 75264, 384, 1)),), {})
cnt: 2, ((T([3, 64, 8, 4, 196, 32], f16, stride=(256, 602112, 32, 150528, 768, 1)),), {})
cnt: 20, ((T([3, 64, 16, 1, 196, 32], f16, stride=(512, 301056, 32, 301056, 1536, 1)),), {})

View File

@ -0,0 +1,158 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([128, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([128, 1000], f16), T([128, 1000], f16), 1, f16), {})
Operator: aten.add.Tensor
cnt: 27, ((T([], i64), 1), {})
cnt: 1, ((T([128, 256, 7, 7], f16), T([128, 256, 7, 7], f16)), {})
cnt: 1, ((T([128, 128, 7, 7], f16), T([128, 128, 7, 7], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([128, 1280], f16), T([1280, 1000], f16, stride=(1, 1280))), {})
Operator: aten.clone.default
cnt: 1, ((T([128, 3, 224, 224], f16),), {})
cnt: 2, ((T([128, 8, 112, 112], f16),), {})
cnt: 1, ((T([128, 16, 112, 112], f16),), {})
cnt: 1, ((T([128, 16, 56, 56], f16),), {})
cnt: 3, ((T([128, 32, 56, 56], f16),), {})
cnt: 1, ((T([128, 32, 28, 28], f16),), {})
cnt: 3, ((T([128, 64, 28, 28], f16),), {})
cnt: 1, ((T([128, 64, 14, 14], f16),), {})
cnt: 11, ((T([128, 128, 14, 14], f16),), {})
cnt: 1, ((T([128, 128, 7, 7], f16),), {})
cnt: 3, ((T([128, 256, 7, 7], f16),), {})
cnt: 1, ((T([128, 1280, 1, 1], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([128, 3, 224, 224], f16), T([8, 3, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 8, 112, 112], f16), T([8, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 8), {})
cnt: 1, ((T([128, 8, 112, 112], f16), T([16, 8, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([16, 1, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 16), {})
cnt: 1, ((T([128, 16, 56, 56], f16), T([32, 16, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 32, 56, 56], f16), T([32, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 32), {})
cnt: 1, ((T([128, 32, 56, 56], f16), T([32, 32, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 32, 56, 56], f16), T([32, 1, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 32), {})
cnt: 1, ((T([128, 32, 28, 28], f16), T([64, 32, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 64, 28, 28], f16), T([64, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 64), {})
cnt: 1, ((T([128, 64, 28, 28], f16), T([64, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 64, 28, 28], f16), T([64, 1, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 64), {})
cnt: 1, ((T([128, 64, 14, 14], f16), T([128, 64, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 5, ((T([128, 128, 14, 14], f16), T([128, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 128), {})
cnt: 5, ((T([128, 128, 14, 14], f16), T([128, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 128, 14, 14], f16), T([128, 1, 5, 5], f16), None, [2, 2], [2, 2], [1, 1], False, [0, 0], 128), {})
cnt: 1, ((T([128, 128, 1, 1], f16), T([32, 128, 1, 1], f16), T([32], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 32, 1, 1], f16), T([128, 32, 1, 1], f16), T([128], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 128, 7, 7], f16), T([256, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 7, 7], f16), T([256, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 256), {})
cnt: 1, ((T([128, 256, 1, 1], f16), T([64, 256, 1, 1], f16), T([64], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 64, 1, 1], f16), T([256, 64, 1, 1], f16), T([256], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 7, 7], f16), T([256, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 256, 1, 1], f16), T([1280, 256, 1, 1], f16), T([1280], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([128, 1280, 1, 1], f16), T([128, 256, 1, 1], f16), T([1280, 256, 1, 1], f16), [1280], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 256, 7, 7], f16), T([128, 256, 7, 7], f16), T([256, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 256, 1, 1], f16), T([128, 64, 1, 1], f16), T([256, 64, 1, 1], f16), [256], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 64, 1, 1], f16), T([128, 256, 1, 1], f16), T([64, 256, 1, 1], f16), [64], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 256, 7, 7], f16), T([128, 256, 7, 7], f16), T([256, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 256, [True, True, False]), {})
cnt: 1, ((T([128, 256, 7, 7], f16), T([128, 128, 7, 7], f16), T([256, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 128, 1, 1], f16), T([128, 32, 1, 1], f16), T([128, 32, 1, 1], f16), [128], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 32, 1, 1], f16), T([128, 128, 1, 1], f16), T([32, 128, 1, 1], f16), [32], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([128, 128, 7, 7], f16), T([128, 128, 14, 14], f16), T([128, 1, 5, 5], f16), [0], [2, 2], [2, 2], [1, 1], False, [0, 0], 128, [True, True, False]), {})
cnt: 5, ((T([128, 128, 14, 14], f16), T([128, 128, 14, 14], f16), T([128, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 5, ((T([128, 128, 14, 14], f16), T([128, 128, 14, 14], f16), T([128, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 128, [True, True, False]), {})
cnt: 1, ((T([128, 128, 14, 14], f16), T([128, 64, 14, 14], f16), T([128, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 64, 14, 14], f16), T([128, 64, 28, 28], f16), T([64, 1, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 64, [True, True, False]), {})
cnt: 1, ((T([128, 64, 28, 28], f16), T([128, 64, 28, 28], f16), T([64, 64, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 64, 28, 28], f16), T([128, 64, 28, 28], f16), T([64, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 64, [True, True, False]), {})
cnt: 1, ((T([128, 64, 28, 28], f16), T([128, 32, 28, 28], f16), T([64, 32, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 32, 28, 28], f16), T([128, 32, 56, 56], f16), T([32, 1, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 32, [True, True, False]), {})
cnt: 1, ((T([128, 32, 56, 56], f16), T([128, 32, 56, 56], f16), T([32, 32, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 32, 56, 56], f16), T([128, 32, 56, 56], f16), T([32, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 32, [True, True, False]), {})
cnt: 1, ((T([128, 32, 56, 56], f16), T([128, 16, 56, 56], f16), T([32, 16, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 16, 56, 56], f16), T([128, 16, 112, 112], f16), T([16, 1, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 16, [True, True, False]), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([128, 8, 112, 112], f16), T([16, 8, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 8, 112, 112], f16), T([128, 8, 112, 112], f16), T([8, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 8, [True, True, False]), {})
cnt: 1, ((T([128, 8, 112, 112], f16), T([128, 3, 224, 224], f16), T([8, 3, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [False, True, False]), {})
Operator: aten.copy_.default
cnt: 1, ((T([128, 3, 224, 224], f16), T([128, 3, 224, 224], f16)), {})
Operator: aten.div.Scalar
cnt: 2, ((T([128, 256, 7, 7], f16, stride=(256, 1, 0, 0)), 49), {})
cnt: 1, ((T([128, 128, 7, 7], f16, stride=(128, 1, 0, 0)), 49), {})
Operator: aten.hardsigmoid.default
cnt: 1, ((T([128, 128, 1, 1], f16),), {})
cnt: 1, ((T([128, 256, 1, 1], f16),), {})
Operator: aten.hardsigmoid_backward.default
cnt: 1, ((T([128, 256, 1, 1], f16), T([128, 256, 1, 1], f16)), {})
cnt: 1, ((T([128, 128, 1, 1], f16), T([128, 128, 1, 1], f16)), {})
Operator: aten.hardswish_.default
cnt: 2, ((T([128, 8, 112, 112], f16),), {})
cnt: 1, ((T([128, 16, 112, 112], f16),), {})
cnt: 1, ((T([128, 16, 56, 56], f16),), {})
cnt: 3, ((T([128, 32, 56, 56], f16),), {})
cnt: 1, ((T([128, 32, 28, 28], f16),), {})
cnt: 3, ((T([128, 64, 28, 28], f16),), {})
cnt: 1, ((T([128, 64, 14, 14], f16),), {})
cnt: 11, ((T([128, 128, 14, 14], f16),), {})
cnt: 1, ((T([128, 128, 7, 7], f16),), {})
cnt: 3, ((T([128, 256, 7, 7], f16),), {})
cnt: 1, ((T([128, 1280, 1, 1], f16),), {})
Operator: aten.hardswish_backward.default
cnt: 1, ((T([128, 1280, 1, 1], f16), T([128, 1280, 1, 1], f16)), {})
cnt: 3, ((T([128, 256, 7, 7], f16), T([128, 256, 7, 7], f16)), {})
cnt: 1, ((T([128, 128, 7, 7], f16), T([128, 128, 7, 7], f16)), {})
cnt: 11, ((T([128, 128, 14, 14], f16), T([128, 128, 14, 14], f16)), {})
cnt: 1, ((T([128, 64, 14, 14], f16), T([128, 64, 14, 14], f16)), {})
cnt: 3, ((T([128, 64, 28, 28], f16), T([128, 64, 28, 28], f16)), {})
cnt: 1, ((T([128, 32, 28, 28], f16), T([128, 32, 28, 28], f16)), {})
cnt: 3, ((T([128, 32, 56, 56], f16), T([128, 32, 56, 56], f16)), {})
cnt: 1, ((T([128, 16, 56, 56], f16), T([128, 16, 56, 56], f16)), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([128, 16, 112, 112], f16)), {})
cnt: 2, ((T([128, 8, 112, 112], f16), T([128, 8, 112, 112], f16)), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([128], i64),), {})
Operator: aten.mean.dim
cnt: 1, ((T([128, 128, 7, 7], f16), [2, 3], True), {})
cnt: 1, ((T([128, 256, 7, 7], f16), [2, 3], True), {})
cnt: 1, ((T([128, 256, 7, 7], f16), [-1, -2], True), {})
Operator: aten.mm.default
cnt: 1, ((T([128, 1000], f16), T([1000, 1280], f16)), {})
cnt: 1, ((T([1000, 128], f16, stride=(1, 1000)), T([128, 1280], f16)), {})
Operator: aten.mul.Tensor
cnt: 2, ((T([128, 128, 7, 7], f16), T([128, 128, 1, 1], f16)), {})
cnt: 2, ((T([128, 256, 7, 7], f16), T([128, 256, 1, 1], f16)), {})
cnt: 1, ((T([128, 256, 7, 7], f16), T([128, 256, 7, 7], f16)), {})
cnt: 1, ((T([128, 128, 7, 7], f16), T([128, 128, 7, 7], f16)), {})
Operator: aten.native_batch_norm.default
cnt: 2, ((T([128, 8, 112, 112], f16), T([8], f16), T([8], f16), T([8], f16), T([8], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([16], f16), T([16], f16), T([16], f16), T([16], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 16, 56, 56], f16), T([16], f16), T([16], f16), T([16], f16), T([16], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 32, 56, 56], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 32, 28, 28], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 64, 28, 28], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 64, 14, 14], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 11, ((T([128, 128, 14, 14], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 128, 7, 7], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 256, 7, 7], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
Operator: aten.native_batch_norm_backward.default
cnt: 3, ((T([128, 256, 7, 7], f16), T([128, 256, 7, 7], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 128, 7, 7], f16), T([128, 128, 7, 7], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 11, ((T([128, 128, 14, 14], f16), T([128, 128, 14, 14], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 64, 14, 14], f16), T([128, 64, 14, 14], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 64, 28, 28], f16), T([128, 64, 28, 28], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 32, 28, 28], f16), T([128, 32, 28, 28], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 32, 56, 56], f16), T([128, 32, 56, 56], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 16, 56, 56], f16), T([128, 16, 56, 56], f16), T([16], f16), T([16], f16), T([16], f16), T([16], f32), T([16], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([128, 16, 112, 112], f16), T([16], f16), T([16], f16), T([16], f16), T([16], f32), T([16], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 8, 112, 112], f16), T([128, 8, 112, 112], f16), T([8], f16), T([8], f16), T([8], f16), T([8], f32), T([8], f32), True, 1e-05, [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([128, 1000], f16), T([128], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([128, 1000], f16), T([128], i64), None, 1, -100), {})
Operator: aten.relu_.default
cnt: 1, ((T([128, 32, 1, 1], f16),), {})
cnt: 1, ((T([128, 64, 1, 1], f16),), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([128, 1000], f16), [0], True), {})
cnt: 1, ((T([128, 256, 7, 7], f16), [2, 3], True), {})
cnt: 1, ((T([128, 128, 7, 7], f16), [2, 3], True), {})
Operator: aten.threshold_backward.default
cnt: 1, ((T([128, 64, 1, 1], f16), T([128, 64, 1, 1], f16), 0), {})
cnt: 1, ((T([128, 32, 1, 1], f16), T([128, 32, 1, 1], f16), 0), {})

View File

@ -0,0 +1,183 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([32, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([32, 1000], f16), T([32, 1000], f16), 1, f16), {})
Operator: aten.add.Tensor
cnt: 9, ((T([32, 256, 56, 56], f16), T([32, 256, 56, 56], f16)), {})
cnt: 24, ((T([32, 512, 28, 28], f16), T([32, 512, 28, 28], f16)), {})
cnt: 108, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16)), {})
cnt: 8, ((T([32, 2048, 7, 7], f16), T([32, 2048, 7, 7], f16)), {})
cnt: 1, ((T([32, 128, 56, 56], f16), T([32, 128, 56, 56], f16)), {})
Operator: aten.add_.Tensor
cnt: 157, ((T([], i64), 1), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([32, 2048], f16), T([2048, 1000], f16, stride=(1, 2048))), {})
Operator: aten.clone.default
cnt: 1, ((T([32, 3, 224, 224], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([32, 3, 224, 224], f16), T([64, 3, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 64, 112, 112], f16), T([64, 64, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 64, 112, 112], f16), T([128, 64, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 128, 56, 56], f16), T([128, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([32, 128, 56, 56], f16), T([256, 2, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 64), {})
cnt: 4, ((T([32, 256, 56, 56], f16), T([256, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 128, 56, 56], f16), T([256, 128, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([32, 256, 1, 1], f16), T([16, 256, 1, 1], f16), T([16], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([32, 16, 1, 1], f16), T([256, 16, 1, 1], f16), T([256], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([32, 256, 56, 56], f16), T([128, 256, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 256, 56, 56], f16), T([512, 4, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 64), {})
cnt: 9, ((T([32, 512, 28, 28], f16), T([512, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 256, 56, 56], f16), T([512, 256, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 8, ((T([32, 512, 1, 1], f16), T([32, 512, 1, 1], f16), T([32], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 8, ((T([32, 32, 1, 1], f16), T([512, 32, 1, 1], f16), T([512], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 7, ((T([32, 512, 28, 28], f16), T([256, 512, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 7, ((T([32, 256, 28, 28], f16), T([512, 4, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 64), {})
cnt: 1, ((T([32, 512, 28, 28], f16), T([1024, 8, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 64), {})
cnt: 37, ((T([32, 1024, 14, 14], f16), T([1024, 1024, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 512, 28, 28], f16), T([1024, 512, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 36, ((T([32, 1024, 1, 1], f16), T([64, 1024, 1, 1], f16), T([64], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 36, ((T([32, 64, 1, 1], f16), T([1024, 64, 1, 1], f16), T([1024], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 35, ((T([32, 1024, 14, 14], f16), T([512, 1024, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 35, ((T([32, 512, 14, 14], f16), T([1024, 8, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 64), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([2048, 16, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 64), {})
cnt: 3, ((T([32, 2048, 7, 7], f16), T([2048, 2048, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([2048, 1024, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([32, 2048, 1, 1], f16), T([128, 2048, 1, 1], f16), T([128], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([32, 128, 1, 1], f16), T([2048, 128, 1, 1], f16), T([2048], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([32, 2048, 7, 7], f16), T([1024, 2048, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([32, 1024, 7, 7], f16), T([2048, 16, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 64), {})
Operator: aten.convolution_backward.default
cnt: 3, ((T([32, 2048, 1, 1], f16), T([32, 128, 1, 1], f16), T([2048, 128, 1, 1], f16), [2048], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 3, ((T([32, 128, 1, 1], f16), T([32, 2048, 1, 1], f16), T([128, 2048, 1, 1], f16), [128], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 3, ((T([32, 2048, 7, 7], f16), T([32, 2048, 7, 7], f16), T([2048, 2048, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([32, 2048, 7, 7], f16), T([32, 1024, 7, 7], f16), T([2048, 16, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 64, [True, True, False]), {})
cnt: 2, ((T([32, 1024, 7, 7], f16), T([32, 2048, 7, 7], f16), T([1024, 2048, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 2048, 7, 7], f16), T([32, 1024, 14, 14], f16), T([2048, 1024, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 2048, 7, 7], f16), T([32, 1024, 14, 14], f16), T([2048, 16, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 64, [True, True, False]), {})
cnt: 37, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16), T([1024, 1024, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 36, ((T([32, 1024, 1, 1], f16), T([32, 64, 1, 1], f16), T([1024, 64, 1, 1], f16), [1024], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 36, ((T([32, 64, 1, 1], f16), T([32, 1024, 1, 1], f16), T([64, 1024, 1, 1], f16), [64], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 35, ((T([32, 1024, 14, 14], f16), T([32, 512, 14, 14], f16), T([1024, 8, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 64, [True, True, False]), {})
cnt: 35, ((T([32, 512, 14, 14], f16), T([32, 1024, 14, 14], f16), T([512, 1024, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([32, 512, 28, 28], f16), T([1024, 512, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 1024, 14, 14], f16), T([32, 512, 28, 28], f16), T([1024, 8, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 64, [True, True, False]), {})
cnt: 9, ((T([32, 512, 28, 28], f16), T([32, 512, 28, 28], f16), T([512, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 8, ((T([32, 512, 1, 1], f16), T([32, 32, 1, 1], f16), T([512, 32, 1, 1], f16), [512], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 8, ((T([32, 32, 1, 1], f16), T([32, 512, 1, 1], f16), T([32, 512, 1, 1], f16), [32], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 7, ((T([32, 512, 28, 28], f16), T([32, 256, 28, 28], f16), T([512, 4, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 64, [True, True, False]), {})
cnt: 7, ((T([32, 256, 28, 28], f16), T([32, 512, 28, 28], f16), T([256, 512, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 512, 28, 28], f16), T([32, 256, 56, 56], f16), T([512, 256, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 512, 28, 28], f16), T([32, 256, 56, 56], f16), T([512, 4, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 64, [True, True, False]), {})
cnt: 4, ((T([32, 256, 56, 56], f16), T([32, 256, 56, 56], f16), T([256, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([32, 256, 1, 1], f16), T([32, 16, 1, 1], f16), T([256, 16, 1, 1], f16), [256], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 3, ((T([32, 16, 1, 1], f16), T([32, 256, 1, 1], f16), T([16, 256, 1, 1], f16), [16], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 3, ((T([32, 256, 56, 56], f16), T([32, 128, 56, 56], f16), T([256, 2, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 64, [True, True, False]), {})
cnt: 2, ((T([32, 128, 56, 56], f16), T([32, 256, 56, 56], f16), T([128, 256, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 256, 56, 56], f16), T([32, 128, 56, 56], f16), T([256, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 128, 56, 56], f16), T([32, 128, 56, 56], f16), T([128, 128, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 128, 112, 112], f16), T([32, 64, 112, 112], f16), T([128, 64, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 64, 112, 112], f16), T([32, 64, 112, 112], f16), T([64, 64, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([32, 64, 112, 112], f16), T([32, 3, 224, 224], f16), T([64, 3, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [False, True, False]), {})
Operator: aten.copy_.default
cnt: 1, ((T([32, 3, 224, 224], f16), T([32, 3, 224, 224], f16)), {})
Operator: aten.div.Scalar
cnt: 4, ((T([32, 2048, 7, 7], f16, stride=(2048, 1, 0, 0)), 49), {})
cnt: 36, ((T([32, 1024, 14, 14], f16, stride=(1024, 1, 0, 0)), 196), {})
cnt: 8, ((T([32, 512, 28, 28], f16, stride=(512, 1, 0, 0)), 784), {})
cnt: 3, ((T([32, 256, 56, 56], f16, stride=(256, 1, 0, 0)), 3136), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([32], i64),), {})
Operator: aten.max_pool2d_with_indices.default
cnt: 1, ((T([32, 128, 112, 112], f16), [3, 3], [2, 2], [0, 0], [1, 1], True), {})
Operator: aten.max_pool2d_with_indices_backward.default
cnt: 1, ((T([32, 128, 56, 56], f16), T([32, 128, 112, 112], f16), [3, 3], [2, 2], [0, 0], [1, 1], True, T([32, 128, 56, 56], i64)), {})
Operator: aten.mean.dim
cnt: 3, ((T([32, 256, 56, 56], f16), [2, 3], True), {})
cnt: 8, ((T([32, 512, 28, 28], f16), [2, 3], True), {})
cnt: 36, ((T([32, 1024, 14, 14], f16), [2, 3], True), {})
cnt: 3, ((T([32, 2048, 7, 7], f16), [2, 3], True), {})
cnt: 1, ((T([32, 2048, 7, 7], f16), [-1, -2], True), {})
Operator: aten.mm.default
cnt: 1, ((T([32, 1000], f16), T([1000, 2048], f16)), {})
cnt: 1, ((T([1000, 32], f16, stride=(1, 1000)), T([32, 2048], f16)), {})
Operator: aten.mul.Tensor
cnt: 6, ((T([32, 256, 56, 56], f16), T([32, 256, 1, 1], f16)), {})
cnt: 16, ((T([32, 512, 28, 28], f16), T([32, 512, 1, 1], f16)), {})
cnt: 72, ((T([32, 1024, 14, 14], f16), T([32, 1024, 1, 1], f16)), {})
cnt: 6, ((T([32, 2048, 7, 7], f16), T([32, 2048, 1, 1], f16)), {})
cnt: 3, ((T([32, 2048, 7, 7], f16), T([32, 2048, 7, 7], f16)), {})
cnt: 36, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16)), {})
cnt: 8, ((T([32, 512, 28, 28], f16), T([32, 512, 28, 28], f16)), {})
cnt: 3, ((T([32, 256, 56, 56], f16), T([32, 256, 56, 56], f16)), {})
Operator: aten.native_batch_norm.default
cnt: 2, ((T([32, 64, 112, 112], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([32, 128, 112, 112], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([32, 128, 56, 56], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 8, ((T([32, 256, 56, 56], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 18, ((T([32, 512, 28, 28], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
cnt: 7, ((T([32, 256, 28, 28], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 74, ((T([32, 1024, 14, 14], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f16), True, 0.1, 1e-05), {})
cnt: 35, ((T([32, 512, 14, 14], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
cnt: 7, ((T([32, 2048, 7, 7], f16), T([2048], f16), T([2048], f16), T([2048], f16), T([2048], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([32, 1024, 7, 7], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f16), True, 0.1, 1e-05), {})
Operator: aten.native_batch_norm_backward.default
cnt: 7, ((T([32, 2048, 7, 7], f16), T([32, 2048, 7, 7], f16), T([2048], f16), T([2048], f16), T([2048], f16), T([2048], f32), T([2048], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([32, 1024, 7, 7], f16), T([32, 1024, 7, 7], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f32), T([1024], f32), True, 1e-05, [True, True, True]), {})
cnt: 74, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16), T([1024], f16), T([1024], f16), T([1024], f16), T([1024], f32), T([1024], f32), True, 1e-05, [True, True, True]), {})
cnt: 35, ((T([32, 512, 14, 14], f16), T([32, 512, 14, 14], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 18, ((T([32, 512, 28, 28], f16), T([32, 512, 28, 28], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 7, ((T([32, 256, 28, 28], f16), T([32, 256, 28, 28], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 8, ((T([32, 256, 56, 56], f16), T([32, 256, 56, 56], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([32, 128, 56, 56], f16), T([32, 128, 56, 56], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([32, 128, 112, 112], f16), T([32, 128, 112, 112], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([32, 64, 112, 112], f16), T([32, 64, 112, 112], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([32, 1000], f16), T([32], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([32, 1000], f16), T([32], i64), None, 1, -100), {})
Operator: aten.relu_.default
cnt: 2, ((T([32, 64, 112, 112], f16),), {})
cnt: 1, ((T([32, 128, 112, 112], f16),), {})
cnt: 3, ((T([32, 128, 56, 56], f16),), {})
cnt: 7, ((T([32, 256, 56, 56], f16),), {})
cnt: 3, ((T([32, 16, 1, 1], f16),), {})
cnt: 17, ((T([32, 512, 28, 28], f16),), {})
cnt: 8, ((T([32, 32, 1, 1], f16),), {})
cnt: 7, ((T([32, 256, 28, 28], f16),), {})
cnt: 73, ((T([32, 1024, 14, 14], f16),), {})
cnt: 36, ((T([32, 64, 1, 1], f16),), {})
cnt: 35, ((T([32, 512, 14, 14], f16),), {})
cnt: 6, ((T([32, 2048, 7, 7], f16),), {})
cnt: 3, ((T([32, 128, 1, 1], f16),), {})
cnt: 2, ((T([32, 1024, 7, 7], f16),), {})
Operator: aten.sigmoid.default
cnt: 3, ((T([32, 256, 1, 1], f16),), {})
cnt: 8, ((T([32, 512, 1, 1], f16),), {})
cnt: 36, ((T([32, 1024, 1, 1], f16),), {})
cnt: 3, ((T([32, 2048, 1, 1], f16),), {})
Operator: aten.sigmoid_backward.default
cnt: 3, ((T([32, 2048, 1, 1], f16), T([32, 2048, 1, 1], f16)), {})
cnt: 36, ((T([32, 1024, 1, 1], f16), T([32, 1024, 1, 1], f16)), {})
cnt: 8, ((T([32, 512, 1, 1], f16), T([32, 512, 1, 1], f16)), {})
cnt: 3, ((T([32, 256, 1, 1], f16), T([32, 256, 1, 1], f16)), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([32, 1000], f16), [0], True), {})
cnt: 3, ((T([32, 2048, 7, 7], f16), [2, 3], True), {})
cnt: 36, ((T([32, 1024, 14, 14], f16), [2, 3], True), {})
cnt: 8, ((T([32, 512, 28, 28], f16), [2, 3], True), {})
cnt: 3, ((T([32, 256, 56, 56], f16), [2, 3], True), {})
Operator: aten.threshold_backward.default
cnt: 6, ((T([32, 2048, 7, 7], f16), T([32, 2048, 7, 7], f16), 0), {})
cnt: 3, ((T([32, 128, 1, 1], f16), T([32, 128, 1, 1], f16), 0), {})
cnt: 2, ((T([32, 1024, 7, 7], f16), T([32, 1024, 7, 7], f16), 0), {})
cnt: 73, ((T([32, 1024, 14, 14], f16), T([32, 1024, 14, 14], f16), 0), {})
cnt: 36, ((T([32, 64, 1, 1], f16), T([32, 64, 1, 1], f16), 0), {})
cnt: 35, ((T([32, 512, 14, 14], f16), T([32, 512, 14, 14], f16), 0), {})
cnt: 17, ((T([32, 512, 28, 28], f16), T([32, 512, 28, 28], f16), 0), {})
cnt: 8, ((T([32, 32, 1, 1], f16), T([32, 32, 1, 1], f16), 0), {})
cnt: 7, ((T([32, 256, 28, 28], f16), T([32, 256, 28, 28], f16), 0), {})
cnt: 7, ((T([32, 256, 56, 56], f16), T([32, 256, 56, 56], f16), 0), {})
cnt: 3, ((T([32, 16, 1, 1], f16), T([32, 16, 1, 1], f16), 0), {})
cnt: 3, ((T([32, 128, 56, 56], f16), T([32, 128, 56, 56], f16), 0), {})
cnt: 1, ((T([32, 128, 112, 112], f16), T([32, 128, 112, 112], f16), 0), {})
cnt: 2, ((T([32, 64, 112, 112], f16), T([32, 64, 112, 112], f16), 0), {})

View File

@ -0,0 +1,295 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([128, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([128, 1000], f16), T([128, 1000], f16), 1, f16), {})
Operator: aten._softmax.default
cnt: 4, ((T([128, 4, 196, 196], f16), -1, False), {})
cnt: 1, ((T([128, 8, 49, 196], f16), -1, False), {})
cnt: 4, ((T([128, 8, 49, 49], f16), -1, False), {})
cnt: 1, ((T([128, 16, 16, 49], f16), -1, False), {})
cnt: 4, ((T([128, 12, 16, 16], f16), -1, False), {})
Operator: aten._softmax_backward_data.default
cnt: 4, ((T([128, 12, 16, 16], f16), T([128, 12, 16, 16], f16), -1, f16), {})
cnt: 1, ((T([128, 16, 16, 49], f16), T([128, 16, 16, 49], f16), -1, f16), {})
cnt: 4, ((T([128, 8, 49, 49], f16), T([128, 8, 49, 49], f16), -1, f16), {})
cnt: 1, ((T([128, 8, 49, 196], f16), T([128, 8, 49, 196], f16), -1, f16), {})
cnt: 4, ((T([128, 4, 196, 196], f16), T([128, 4, 196, 196], f16), -1, f16), {})
Operator: aten._unsafe_view.default
cnt: 8, ((T([128, 196, 256], f16), [128, 196, 256]), {})
cnt: 4, ((T([128, 4, 196, 16], f16), [512, 196, 16]), {})
cnt: 4, ((T([128, 4, 16, 196], f16), [512, 16, 196]), {})
cnt: 4, ((T([512, 196, 196], f16), [128, 4, 196, 196]), {})
cnt: 8, ((T([128, 4, 196, 32], f16), [512, 196, 32]), {})
cnt: 4, ((T([512, 196, 32], f16), [128, 4, 196, 32]), {})
cnt: 4, ((T([128, 196, 4, 32], f16), [128, 196, 128]), {})
cnt: 8, ((T([25088, 128], f16), [128, 196, 128]), {})
cnt: 1, ((T([128, 196, 640], f16), [128, 196, 640]), {})
cnt: 1, ((T([128, 7, 7, 128], f16), [128, 49, 128]), {})
cnt: 1, ((T([6272, 128], f16), [128, 49, 128]), {})
cnt: 5, ((T([128, 8, 49, 16], f16), [1024, 49, 16]), {})
cnt: 1, ((T([128, 8, 16, 196], f16), [1024, 16, 196]), {})
cnt: 1, ((T([1024, 49, 196], f16), [128, 8, 49, 196]), {})
cnt: 1, ((T([128, 8, 196, 64], f16), [1024, 196, 64]), {})
cnt: 1, ((T([1024, 49, 64], f16), [128, 8, 49, 64]), {})
cnt: 1, ((T([128, 49, 8, 64], f16), [128, 49, 512]), {})
cnt: 10, ((T([6272, 256], f16), [128, 49, 256]), {})
cnt: 9, ((T([6272, 512], f16), [128, 49, 512]), {})
cnt: 4, ((T([128, 8, 16, 49], f16), [1024, 16, 49]), {})
cnt: 4, ((T([1024, 49, 49], f16), [128, 8, 49, 49]), {})
cnt: 8, ((T([128, 8, 49, 32], f16), [1024, 49, 32]), {})
cnt: 4, ((T([1024, 49, 32], f16), [128, 8, 49, 32]), {})
cnt: 4, ((T([128, 49, 8, 32], f16), [128, 49, 256]), {})
cnt: 1, ((T([6272, 1280], f16), [128, 49, 1280]), {})
cnt: 1, ((T([128, 4, 4, 256], f16), [128, 16, 256]), {})
cnt: 1, ((T([2048, 256], f16), [128, 16, 256]), {})
cnt: 1, ((T([128, 16, 16, 16], f16), [2048, 16, 16]), {})
cnt: 1, ((T([128, 16, 16, 49], f16), [2048, 16, 49]), {})
cnt: 1, ((T([2048, 16, 49], f16), [128, 16, 16, 49]), {})
cnt: 1, ((T([128, 16, 49, 64], f16), [2048, 49, 64]), {})
cnt: 1, ((T([2048, 16, 64], f16), [128, 16, 16, 64]), {})
cnt: 1, ((T([128, 16, 16, 64], f16), [128, 16, 1024]), {})
cnt: 10, ((T([2048, 384], f16), [128, 16, 384]), {})
cnt: 9, ((T([2048, 768], f16), [128, 16, 768]), {})
cnt: 8, ((T([128, 12, 16, 16], f16), [1536, 16, 16]), {})
cnt: 4, ((T([1536, 16, 16], f16), [128, 12, 16, 16]), {})
cnt: 8, ((T([128, 12, 16, 32], f16), [1536, 16, 32]), {})
cnt: 4, ((T([1536, 16, 32], f16), [128, 12, 16, 32]), {})
cnt: 4, ((T([128, 16, 12, 32], f16), [128, 16, 384]), {})
cnt: 1, ((T([128, 16, 16, 64], f16), [2048, 16, 64]), {})
cnt: 1, ((T([128, 16, 16, 16], f16), [128, 16, 256]), {})
cnt: 1, ((T([128, 8, 49, 64], f16), [1024, 49, 64]), {})
cnt: 1, ((T([128, 49, 8, 16], f16), [128, 49, 128]), {})
Operator: aten.add.Tensor
cnt: 4, ((T([128, 4, 196, 196], f16), T([4, 196, 196], f16)), {})
cnt: 8, ((T([128, 196, 128], f16, stride=(25088, 1, 196)), T([128, 196, 128], f16)), {})
cnt: 1, ((T([128, 8, 49, 196], f16), T([8, 49, 196], f16)), {})
cnt: 19, ((T([128, 49, 256], f16), T([128, 49, 256], f16)), {})
cnt: 4, ((T([128, 8, 49, 49], f16), T([8, 49, 49], f16)), {})
cnt: 1, ((T([128, 16, 16, 49], f16), T([16, 16, 49], f16)), {})
cnt: 18, ((T([128, 16, 384], f16), T([128, 16, 384], f16)), {})
cnt: 4, ((T([128, 12, 16, 16], f16), T([12, 16, 16], f16)), {})
cnt: 1, ((T([128, 1000], f16), T([128, 1000], f16)), {})
cnt: 1, ((T([128, 384], f16), T([128, 384], f16)), {})
cnt: 9, ((T([128, 196, 128], f16), T([128, 196, 128], f16)), {})
Operator: aten.add_.Tensor
cnt: 64, ((T([], i64), 1), {})
Operator: aten.addmm.default
cnt: 2, ((T([1000], f16), T([128, 384], f16), T([384, 1000], f16, stride=(1, 384))), {})
Operator: aten.bmm.default
cnt: 8, ((T([128, 196, 128], f16, stride=(25088, 1, 196)), T([128, 128, 256], f16, stride=(0, 1, 128))), {})
cnt: 4, ((T([512, 196, 16], f16), T([512, 16, 196], f16)), {})
cnt: 4, ((T([512, 196, 196], f16), T([512, 196, 32], f16)), {})
cnt: 1, ((T([128, 196, 128], f16, stride=(25088, 1, 196)), T([128, 128, 640], f16, stride=(0, 1, 128))), {})
cnt: 1, ((T([1024, 49, 16], f16), T([1024, 16, 196], f16)), {})
cnt: 1, ((T([1024, 49, 196], f16), T([1024, 196, 64], f16)), {})
cnt: 4, ((T([1024, 49, 16], f16), T([1024, 16, 49], f16)), {})
cnt: 4, ((T([1024, 49, 49], f16), T([1024, 49, 32], f16)), {})
cnt: 1, ((T([2048, 16, 16], f16), T([2048, 16, 49], f16)), {})
cnt: 1, ((T([2048, 16, 49], f16), T([2048, 49, 64], f16)), {})
cnt: 4, ((T([1536, 16, 16], f16), T([1536, 16, 16], f16)), {})
cnt: 4, ((T([1536, 16, 16], f16), T([1536, 16, 32], f16)), {})
cnt: 4, ((T([1536, 16, 16], f16, stride=(256, 1, 16)), T([1536, 16, 32], f16)), {})
cnt: 4, ((T([1536, 16, 32], f16), T([1536, 32, 16], f16, stride=(512, 1, 32))), {})
cnt: 4, ((T([1536, 16, 16], f16, stride=(256, 1, 16)), T([1536, 16, 16], f16)), {})
cnt: 4, ((T([1536, 16, 16], f16), T([1536, 16, 16], f16, stride=(256, 1, 16))), {})
cnt: 1, ((T([2048, 49, 16], f16, stride=(784, 1, 49)), T([2048, 16, 64], f16)), {})
cnt: 1, ((T([2048, 16, 64], f16), T([2048, 64, 49], f16, stride=(3136, 1, 64))), {})
cnt: 1, ((T([2048, 16, 16], f16, stride=(256, 1, 16)), T([2048, 16, 49], f16)), {})
cnt: 1, ((T([2048, 16, 49], f16), T([2048, 49, 16], f16, stride=(784, 1, 49))), {})
cnt: 4, ((T([1024, 49, 49], f16, stride=(2401, 1, 49)), T([1024, 49, 32], f16)), {})
cnt: 4, ((T([1024, 49, 32], f16), T([1024, 32, 49], f16, stride=(1568, 1, 32))), {})
cnt: 4, ((T([1024, 16, 49], f16, stride=(784, 1, 16)), T([1024, 49, 49], f16)), {})
cnt: 4, ((T([1024, 49, 49], f16), T([1024, 49, 16], f16, stride=(784, 1, 49))), {})
cnt: 1, ((T([1024, 196, 49], f16, stride=(9604, 1, 196)), T([1024, 49, 64], f16)), {})
cnt: 1, ((T([1024, 49, 64], f16), T([1024, 64, 196], f16, stride=(12544, 1, 64))), {})
cnt: 1, ((T([1024, 16, 49], f16, stride=(784, 1, 16)), T([1024, 49, 196], f16)), {})
cnt: 1, ((T([1024, 49, 196], f16), T([1024, 196, 16], f16, stride=(3136, 1, 196))), {})
cnt: 1, ((T([128, 128, 196], f16), T([128, 196, 640], f16)), {})
cnt: 1, ((T([128, 196, 640], f16), T([128, 640, 128], f16, stride=(0, 128, 1))), {})
cnt: 8, ((T([128, 128, 196], f16), T([128, 196, 256], f16)), {})
cnt: 8, ((T([128, 196, 256], f16), T([128, 256, 128], f16, stride=(0, 128, 1))), {})
cnt: 4, ((T([512, 196, 196], f16, stride=(38416, 1, 196)), T([512, 196, 32], f16)), {})
cnt: 4, ((T([512, 196, 32], f16), T([512, 32, 196], f16, stride=(6272, 1, 32))), {})
cnt: 4, ((T([512, 16, 196], f16, stride=(3136, 1, 16)), T([512, 196, 196], f16)), {})
cnt: 4, ((T([512, 196, 196], f16), T([512, 196, 16], f16, stride=(3136, 1, 196))), {})
Operator: aten.cat.default
cnt: 4, (([T([128, 16, 12, 16], f16, stride=(3072, 16, 256, 1)), T([128, 16, 12, 16], f16, stride=(3072, 1, 256, 16)), T([128, 16, 12, 32], f16, stride=(6144, 32, 512, 1))], 3), {})
cnt: 1, (([T([128, 49, 16, 16], f16, stride=(12544, 1, 784, 49)), T([128, 49, 16, 64], f16, stride=(50176, 64, 3136, 1))], 3), {})
cnt: 4, (([T([128, 49, 8, 16], f16, stride=(6272, 16, 784, 1)), T([128, 49, 8, 16], f16, stride=(6272, 1, 784, 49)), T([128, 49, 8, 32], f16, stride=(12544, 32, 1568, 1))], 3), {})
cnt: 1, (([T([128, 196, 8, 16], f16, stride=(25088, 1, 3136, 196)), T([128, 196, 8, 64], f16, stride=(100352, 64, 12544, 1))], 3), {})
cnt: 4, (([T([128, 196, 4, 16], f16, stride=(12544, 16, 3136, 1)), T([128, 196, 4, 16], f16, stride=(12544, 1, 3136, 196)), T([128, 196, 4, 32], f16, stride=(25088, 32, 6272, 1))], 3), {})
Operator: aten.clone.default
cnt: 1, ((T([128, 3, 224, 224], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([128, 3, 224, 224], f16), T([16, 3, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([32, 16, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 32, 56, 56], f16), T([64, 32, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 64, 28, 28], f16), T([128, 64, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([128, 128, 14, 14], f16, stride=(25088, 1, 1792, 128)), T([128, 64, 28, 28], f16), T([128, 64, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 64, 28, 28], f16), T([128, 32, 56, 56], f16), T([64, 32, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 32, 56, 56], f16), T([128, 16, 112, 112], f16), T([32, 16, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([128, 3, 224, 224], f16), T([16, 3, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [False, True, False]), {})
Operator: aten.copy_.default
cnt: 1, ((T([128, 3, 224, 224], f16), T([128, 3, 224, 224], f16)), {})
cnt: 1, ((T([640, 128], f16), T([640, 128], f16, stride=(1, 640))), {})
cnt: 8, ((T([256, 128], f16), T([256, 128], f16, stride=(1, 256))), {})
Operator: aten.div.Scalar
cnt: 1, ((T([128, 16, 384], f16, stride=(384, 0, 1)), 16), {})
Operator: aten.div.Tensor
cnt: 2, ((T([128, 1000], f16), 2), {})
Operator: aten.hardswish.default
cnt: 1, ((T([128, 16, 112, 112], f16),), {})
cnt: 1, ((T([128, 32, 56, 56], f16),), {})
cnt: 1, ((T([128, 64, 28, 28], f16),), {})
cnt: 4, ((T([128, 196, 128], f16),), {})
cnt: 4, ((T([128, 196, 256], f16),), {})
cnt: 6, ((T([128, 49, 512], f16),), {})
cnt: 4, ((T([128, 49, 256], f16),), {})
cnt: 1, ((T([128, 16, 1024], f16),), {})
cnt: 5, ((T([128, 16, 768], f16),), {})
cnt: 4, ((T([128, 16, 384], f16),), {})
Operator: aten.hardswish_backward.default
cnt: 5, ((T([128, 16, 768], f16), T([128, 16, 768], f16)), {})
cnt: 4, ((T([128, 16, 384], f16), T([128, 16, 384], f16)), {})
cnt: 1, ((T([128, 16, 1024], f16), T([128, 16, 1024], f16)), {})
cnt: 6, ((T([128, 49, 512], f16), T([128, 49, 512], f16)), {})
cnt: 4, ((T([128, 49, 256], f16), T([128, 49, 256], f16)), {})
cnt: 4, ((T([128, 196, 256], f16), T([128, 196, 256], f16)), {})
cnt: 4, ((T([128, 196, 128], f16), T([128, 196, 128], f16)), {})
cnt: 1, ((T([128, 64, 28, 28], f16), T([128, 64, 28, 28], f16)), {})
cnt: 1, ((T([128, 32, 56, 56], f16), T([128, 32, 56, 56], f16)), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([128, 16, 112, 112], f16)), {})
Operator: aten.index.Tensor
cnt: 4, ((T([4, 196], f16), [None, T([196, 196], i64)]), {})
cnt: 1, ((T([8, 196], f16), [None, T([49, 196], i64)]), {})
cnt: 4, ((T([8, 49], f16), [None, T([49, 49], i64)]), {})
cnt: 1, ((T([16, 49], f16), [None, T([16, 49], i64)]), {})
cnt: 4, ((T([12, 16], f16), [None, T([16, 16], i64)]), {})
Operator: aten.index_put.default
cnt: 4, ((T([12, 16], f16), [None, T([16, 16], i64)], T([12, 16, 16], f16), True), {})
cnt: 1, ((T([16, 49], f16), [None, T([16, 49], i64)], T([16, 16, 49], f16), True), {})
cnt: 4, ((T([8, 49], f16), [None, T([49, 49], i64)], T([8, 49, 49], f16), True), {})
cnt: 1, ((T([8, 196], f16), [None, T([49, 196], i64)], T([8, 49, 196], f16), True), {})
cnt: 4, ((T([4, 196], f16), [None, T([196, 196], i64)], T([4, 196, 196], f16), True), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([128], i64),), {})
Operator: aten.mean.dim
cnt: 1, ((T([128, 16, 384], f16), [1]), {})
Operator: aten.mm.default
cnt: 4, ((T([25088, 128], f16), T([128, 128], f16, stride=(1, 128))), {})
cnt: 4, ((T([25088, 256], f16), T([256, 128], f16, stride=(1, 256))), {})
cnt: 1, ((T([6272, 128], f16), T([128, 128], f16, stride=(1, 128))), {})
cnt: 6, ((T([6272, 512], f16), T([512, 256], f16, stride=(1, 512))), {})
cnt: 9, ((T([6272, 256], f16), T([256, 512], f16, stride=(1, 256))), {})
cnt: 4, ((T([6272, 256], f16), T([256, 256], f16, stride=(1, 256))), {})
cnt: 1, ((T([6272, 256], f16), T([256, 1280], f16, stride=(1, 256))), {})
cnt: 1, ((T([2048, 256], f16), T([256, 256], f16, stride=(1, 256))), {})
cnt: 1, ((T([2048, 1024], f16), T([1024, 384], f16, stride=(1, 1024))), {})
cnt: 9, ((T([2048, 384], f16), T([384, 768], f16, stride=(1, 384))), {})
cnt: 5, ((T([2048, 768], f16), T([768, 384], f16, stride=(1, 768))), {})
cnt: 4, ((T([2048, 384], f16), T([384, 384], f16, stride=(1, 384))), {})
cnt: 2, ((T([128, 1000], f16), T([1000, 384], f16)), {})
cnt: 2, ((T([1000, 128], f16, stride=(1, 1000)), T([128, 384], f16)), {})
cnt: 5, ((T([384, 2048], f16, stride=(1, 384)), T([2048, 768], f16)), {})
cnt: 5, ((T([2048, 384], f16), T([384, 768], f16)), {})
cnt: 9, ((T([768, 2048], f16, stride=(1, 768)), T([2048, 384], f16)), {})
cnt: 9, ((T([2048, 768], f16), T([768, 384], f16)), {})
cnt: 4, ((T([384, 2048], f16, stride=(1, 384)), T([2048, 384], f16)), {})
cnt: 4, ((T([2048, 384], f16), T([384, 384], f16)), {})
cnt: 1, ((T([384, 2048], f16, stride=(1, 384)), T([2048, 1024], f16)), {})
cnt: 1, ((T([2048, 384], f16), T([384, 1024], f16)), {})
cnt: 1, ((T([256, 2048], f16, stride=(1, 256)), T([2048, 256], f16)), {})
cnt: 1, ((T([2048, 256], f16), T([256, 256], f16)), {})
cnt: 1, ((T([1280, 6272], f16, stride=(1, 1280)), T([6272, 256], f16)), {})
cnt: 1, ((T([6272, 1280], f16), T([1280, 256], f16)), {})
cnt: 6, ((T([256, 6272], f16, stride=(1, 256)), T([6272, 512], f16)), {})
cnt: 6, ((T([6272, 256], f16), T([256, 512], f16)), {})
cnt: 9, ((T([512, 6272], f16, stride=(1, 512)), T([6272, 256], f16)), {})
cnt: 9, ((T([6272, 512], f16), T([512, 256], f16)), {})
cnt: 4, ((T([256, 6272], f16, stride=(1, 256)), T([6272, 256], f16)), {})
cnt: 4, ((T([6272, 256], f16), T([256, 256], f16)), {})
cnt: 1, ((T([128, 6272], f16, stride=(1, 128)), T([6272, 128], f16)), {})
cnt: 1, ((T([6272, 128], f16), T([128, 128], f16)), {})
cnt: 4, ((T([128, 25088], f16, stride=(1, 128)), T([25088, 256], f16)), {})
cnt: 4, ((T([25088, 128], f16), T([128, 256], f16)), {})
cnt: 4, ((T([128, 25088], f16, stride=(1, 128)), T([25088, 128], f16)), {})
cnt: 4, ((T([25088, 128], f16), T([128, 128], f16)), {})
Operator: aten.mul.Tensor
cnt: 8, ((T([128, 4, 196, 196], f16), 0.25), {})
cnt: 2, ((T([128, 8, 49, 196], f16), 0.25), {})
cnt: 8, ((T([128, 8, 49, 49], f16), 0.25), {})
cnt: 2, ((T([128, 16, 16, 49], f16), 0.25), {})
cnt: 8, ((T([128, 12, 16, 16], f16), 0.25), {})
Operator: aten.native_batch_norm.default
cnt: 1, ((T([128, 16, 112, 112], f16), T([16], f16), T([16], f16), T([16], f16), T([16], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 32, 56, 56], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 64, 28, 28], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 128, 14, 14], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 8, ((T([25088, 256], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 8, ((T([25088, 128], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([25088, 640], f16), T([640], f16), T([640], f16), T([640], f16), T([640], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([6272, 128], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f16), True, 0.1, 1e-05), {})
cnt: 10, ((T([6272, 256], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 9, ((T([6272, 512], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([6272, 1280], f16), T([1280], f16), T([1280], f16), T([1280], f16), T([1280], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([2048, 256], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f16), True, 0.1, 1e-05), {})
cnt: 10, ((T([2048, 384], f16), T([384], f16), T([384], f16), T([384], f16), T([384], f16), True, 0.1, 1e-05), {})
cnt: 9, ((T([2048, 768], f16), T([768], f16), T([768], f16), T([768], f16), T([768], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([128, 384], f16), T([384], f16), T([384], f16), T([384], f16), T([384], f16), True, 0.1, 1e-05), {})
Operator: aten.native_batch_norm_backward.default
cnt: 2, ((T([128, 384], f16), T([128, 384], f16), T([384], f16), T([384], f16), T([384], f16), T([384], f32), T([384], f32), True, 1e-05, [True, True, True]), {})
cnt: 10, ((T([2048, 384], f16), T([2048, 384], f16), T([384], f16), T([384], f16), T([384], f16), T([384], f32), T([384], f32), True, 1e-05, [True, True, True]), {})
cnt: 9, ((T([2048, 768], f16), T([2048, 768], f16), T([768], f16), T([768], f16), T([768], f16), T([768], f32), T([768], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([2048, 256], f16), T([2048, 256], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([6272, 1280], f16), T([6272, 1280], f16), T([1280], f16), T([1280], f16), T([1280], f16), T([1280], f32), T([1280], f32), True, 1e-05, [True, True, True]), {})
cnt: 10, ((T([6272, 256], f16), T([6272, 256], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 9, ((T([6272, 512], f16), T([6272, 512], f16), T([512], f16), T([512], f16), T([512], f16), T([512], f32), T([512], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([6272, 128], f16), T([6272, 128], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([25088, 640], f16), T([25088, 640], f16), T([640], f16), T([640], f16), T([640], f16), T([640], f32), T([640], f32), True, 1e-05, [True, True, True]), {})
cnt: 8, ((T([25088, 128], f16), T([25088, 128], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 8, ((T([25088, 256], f16), T([25088, 256], f16), T([256], f16), T([256], f16), T([256], f16), T([256], f32), T([256], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 128, 14, 14], f16, stride=(25088, 1, 1792, 128)), T([128, 128, 14, 14], f16), T([128], f16), T([128], f16), T([128], f16), T([128], f32), T([128], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 64, 28, 28], f16), T([128, 64, 28, 28], f16), T([64], f16), T([64], f16), T([64], f16), T([64], f32), T([64], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 32, 56, 56], f16), T([128, 32, 56, 56], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([128, 16, 112, 112], f16), T([16], f16), T([16], f16), T([16], f16), T([16], f32), T([16], f32), True, 1e-05, [True, True, True]), {})
Operator: aten.new_empty_strided.default
cnt: 1, ((T([640, 128], f16, stride=(1, 640)), [640, 128], [128, 1]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
cnt: 8, ((T([256, 128], f16, stride=(1, 256)), [256, 128], [128, 1]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten.new_zeros.default
cnt: 4, ((T([12, 16, 16], f16), [12, 16]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
cnt: 1, ((T([16, 16, 49], f16), [16, 49]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
cnt: 4, ((T([8, 49, 49], f16), [8, 49]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
cnt: 1, ((T([8, 49, 196], f16), [8, 196]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
cnt: 4, ((T([4, 196, 196], f16), [4, 196]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([128, 1000], f16), T([128], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([128, 1000], f16), T([128], i64), None, 1, -100), {})
Operator: aten.slice_backward.default
cnt: 4, ((T([12, 16], f16), [12, 16], 0, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([16, 49], f16), [16, 49], 0, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([128, 4, 4, 256], f16), [128, 4, 7, 256], 2, 0, 9223372036854775807, 2), {})
cnt: 1, ((T([128, 4, 7, 256], f16), [128, 7, 7, 256], 1, 0, 9223372036854775807, 2), {})
cnt: 1, ((T([128, 7, 7, 256], f16), [128, 7, 7, 256], 0, 0, 9223372036854775807, 1), {})
cnt: 4, ((T([8, 49], f16), [8, 49], 0, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([8, 196], f16), [8, 196], 0, 0, 9223372036854775807, 1), {})
cnt: 1, ((T([128, 7, 7, 128], f16), [128, 7, 14, 128], 2, 0, 9223372036854775807, 2), {})
cnt: 1, ((T([128, 7, 14, 128], f16), [128, 14, 14, 128], 1, 0, 9223372036854775807, 2), {})
cnt: 1, ((T([128, 14, 14, 128], f16), [128, 14, 14, 128], 0, 0, 9223372036854775807, 1), {})
cnt: 4, ((T([4, 196], f16), [4, 196], 0, 0, 9223372036854775807, 1), {})
Operator: aten.split_with_sizes.default
cnt: 4, ((T([128, 196, 4, 64], f16), [16, 16, 32], 3), {})
cnt: 1, ((T([128, 196, 8, 80], f16), [16, 64], 3), {})
cnt: 4, ((T([128, 49, 8, 64], f16), [16, 16, 32], 3), {})
cnt: 1, ((T([128, 49, 16, 80], f16), [16, 64], 3), {})
cnt: 4, ((T([128, 16, 12, 64], f16), [16, 16, 32], 3), {})
Operator: aten.sum.SymInt
cnt: 2, ((T([128, 1000], f16), [0], True), {})
cnt: 4, ((T([128, 12, 16, 16], f16), [0], True), {})
cnt: 1, ((T([128, 16, 16, 49], f16), [0], True), {})
cnt: 4, ((T([128, 8, 49, 49], f16), [0], True), {})
cnt: 1, ((T([128, 8, 49, 196], f16), [0], True), {})
cnt: 1, ((T([128, 128, 640], f16), [0], True), {})
cnt: 8, ((T([128, 128, 256], f16), [0], True), {})
cnt: 4, ((T([128, 4, 196, 196], f16), [0], True), {})

View File

@ -0,0 +1,70 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([64, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([64, 1000], f16), T([64, 1000], f16), 1, f16), {})
Operator: aten._unsafe_view.default
cnt: 12, ((T([64, 768, 384], f16), [64, 768, 384]), {})
cnt: 12, ((T([64, 768, 196], f16), [49152, 196]), {})
Operator: aten.add.Tensor
cnt: 12, ((T([64, 768, 384], f16), T([384], f16)), {})
cnt: 12, ((T([64, 196, 768], f16, stride=(150528, 1, 196)), T([64, 196, 768], f16, stride=(150528, 1, 196))), {})
cnt: 12, ((T([64, 196, 768], f16, stride=(150528, 1, 196)), T([64, 196, 768], f16)), {})
cnt: 12, ((T([64, 196, 768], f16), T([64, 196, 768], f16)), {})
cnt: 12, ((T([64, 196, 768], f16), T([64, 196, 768], f16, stride=(150528, 1, 196))), {})
Operator: aten.addmm.default
cnt: 12, ((T([196], f16), T([49152, 384], f16), T([384, 196], f16, stride=(1, 384))), {})
cnt: 12, ((T([3072], f16), T([12544, 768], f16), T([768, 3072], f16, stride=(1, 768))), {})
cnt: 12, ((T([768], f16), T([12544, 3072], f16), T([3072, 768], f16, stride=(1, 3072))), {})
cnt: 1, ((T([1000], f16), T([64, 768], f16), T([768, 1000], f16, stride=(1, 768))), {})
Operator: aten.bmm.default
cnt: 12, ((T([64, 768, 196], f16, stride=(150528, 1, 768)), T([64, 196, 384], f16, stride=(0, 1, 196))), {})
cnt: 12, ((T([64, 196, 768], f16), T([64, 768, 384], f16)), {})
cnt: 12, ((T([64, 768, 384], f16), T([64, 384, 196], f16, stride=(0, 196, 1))), {})
Operator: aten.clone.default
cnt: 1, ((T([64, 3, 224, 224], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([64, 3, 224, 224], f16), T([768, 3, 16, 16], f16), T([768], f16), [16, 16], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([64, 768, 14, 14], f16, stride=(150528, 1, 10752, 768)), T([64, 3, 224, 224], f16), T([768, 3, 16, 16], f16), [768], [16, 16], [0, 0], [1, 1], False, [0, 0], 1, [False, True, True]), {})
Operator: aten.copy_.default
cnt: 1, ((T([64, 3, 224, 224], f16), T([64, 3, 224, 224], f16)), {})
cnt: 12, ((T([384, 196], f16), T([384, 196], f16, stride=(1, 384))), {})
Operator: aten.div.Scalar
cnt: 1, ((T([64, 196, 768], f16, stride=(768, 0, 1)), 196), {})
Operator: aten.gelu.default
cnt: 12, ((T([64, 768, 384], f16),), {})
cnt: 12, ((T([64, 196, 3072], f16),), {})
Operator: aten.gelu_backward.default
cnt: 12, ((T([64, 196, 3072], f16), T([64, 196, 3072], f16)), {})
cnt: 12, ((T([64, 768, 384], f16), T([64, 768, 384], f16)), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([64], i64),), {})
Operator: aten.mean.dim
cnt: 1, ((T([64, 196, 768], f16), [1]), {})
Operator: aten.mm.default
cnt: 1, ((T([64, 1000], f16), T([1000, 768], f16)), {})
cnt: 1, ((T([1000, 64], f16, stride=(1, 1000)), T([64, 768], f16)), {})
cnt: 12, ((T([12544, 768], f16), T([768, 3072], f16)), {})
cnt: 12, ((T([768, 12544], f16, stride=(1, 768)), T([12544, 3072], f16)), {})
cnt: 12, ((T([12544, 3072], f16), T([3072, 768], f16)), {})
cnt: 12, ((T([3072, 12544], f16, stride=(1, 3072)), T([12544, 768], f16)), {})
cnt: 12, ((T([49152, 196], f16), T([196, 384], f16)), {})
cnt: 12, ((T([196, 49152], f16, stride=(1, 196)), T([49152, 384], f16)), {})
Operator: aten.native_layer_norm.default
cnt: 25, ((T([64, 196, 768], f16, stride=(150528, 1, 196)), [768], T([768], f16), T([768], f16), 1e-06), {})
Operator: aten.native_layer_norm_backward.default
cnt: 13, ((T([64, 196, 768], f16), T([64, 196, 768], f16, stride=(150528, 1, 196)), [768], T([64, 196, 1], f32), T([64, 196, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
cnt: 12, ((T([64, 196, 768], f16, stride=(150528, 1, 196)), T([64, 196, 768], f16, stride=(150528, 1, 196)), [768], T([64, 196, 1], f32), T([64, 196, 1], f32), T([768], f16), T([768], f16), [True, True, True]), {})
Operator: aten.new_empty_strided.default
cnt: 12, ((T([384, 196], f16, stride=(1, 384)), [384, 196], [196, 1]), {'dtype': f16, 'layout': torch.strided, 'device': 'cuda'})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([64, 1000], f16), T([64], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([64, 1000], f16), T([64], i64), None, 1, -100), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([64, 1000], f16), [0], True), {})
cnt: 12, ((T([12544, 768], f16), [0], True), {})
cnt: 12, ((T([12544, 3072], f16), [0], True), {})
cnt: 12, ((T([49152, 196], f16), [0], True), {})
cnt: 12, ((T([64, 768, 384], f16), [0, 1], True), {})
cnt: 12, ((T([64, 196, 384], f16), [0], True), {})

View File

@ -0,0 +1,378 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([64, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([64, 1000], f16), T([64, 1000], f16), 1, f16), {})
Operator: aten.add.Tensor
cnt: 58, ((T([], i64), 1), {})
cnt: 2, ((T([64, 32, 112, 112], f16), T([64, 32, 112, 112], f16)), {})
cnt: 2, ((T([64, 40, 56, 56], f16), T([64, 40, 56, 56], f16)), {})
cnt: 6, ((T([64, 56, 28, 28], f16), T([64, 56, 28, 28], f16)), {})
cnt: 6, ((T([64, 104, 14, 14], f16), T([64, 104, 14, 14], f16)), {})
cnt: 6, ((T([64, 160, 14, 14], f16), T([64, 160, 14, 14], f16)), {})
cnt: 6, ((T([64, 264, 7, 7], f16), T([64, 264, 7, 7], f16)), {})
cnt: 3, ((T([64, 1584, 7, 7], f16), T([64, 1584, 7, 7], f16)), {})
cnt: 1, ((T([64, 960, 7, 7], f16), T([64, 960, 7, 7], f16)), {})
cnt: 3, ((T([64, 480, 14, 14], f16), T([64, 480, 14, 14], f16)), {})
cnt: 4, ((T([64, 624, 14, 14], f16), T([64, 624, 14, 14], f16)), {})
cnt: 1, ((T([64, 336, 14, 14], f16), T([64, 336, 14, 14], f16)), {})
cnt: 3, ((T([64, 336, 28, 28], f16), T([64, 336, 28, 28], f16)), {})
cnt: 1, ((T([64, 240, 28, 28], f16), T([64, 240, 28, 28], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([64, 1536], f16), T([1536, 1000], f16, stride=(1, 1536))), {})
Operator: aten.cat.default
cnt: 1, (([T([64, 96, 112, 112], f16), T([64, 96, 112, 112], f16)], 1), {})
cnt: 1, (([T([64, 64, 56, 56], f16), T([64, 64, 56, 56], f16), T([64, 64, 56, 56], f16)], 1), {})
cnt: 3, (([T([64, 20, 56, 56], f16), T([64, 20, 56, 56], f16)], 1), {})
cnt: 2, (([T([64, 60, 56, 56], f16), T([64, 60, 56, 56], f16)], 1), {})
cnt: 1, (([T([64, 60, 28, 28], f16), T([64, 60, 28, 28], f16), T([64, 60, 28, 28], f16), T([64, 60, 28, 28], f16)], 1), {})
cnt: 12, (([T([64, 168, 28, 28], f16), T([64, 168, 28, 28], f16)], 1), {})
cnt: 6, (([T([64, 28, 28, 28], f16), T([64, 28, 28, 28], f16)], 1), {})
cnt: 1, (([T([64, 112, 14, 14], f16), T([64, 112, 14, 14], f16), T([64, 112, 14, 14], f16)], 1), {})
cnt: 6, (([T([64, 312, 14, 14], f16), T([64, 312, 14, 14], f16)], 1), {})
cnt: 6, (([T([64, 156, 14, 14], f16), T([64, 156, 14, 14], f16), T([64, 156, 14, 14], f16), T([64, 156, 14, 14], f16)], 1), {})
cnt: 6, (([T([64, 52, 14, 14], f16), T([64, 52, 14, 14], f16)], 1), {})
cnt: 6, (([T([64, 240, 14, 14], f16), T([64, 240, 14, 14], f16)], 1), {})
cnt: 6, (([T([64, 120, 14, 14], f16), T([64, 120, 14, 14], f16), T([64, 120, 14, 14], f16), T([64, 120, 14, 14], f16)], 1), {})
cnt: 6, (([T([64, 80, 14, 14], f16), T([64, 80, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 240, 7, 7], f16), T([64, 240, 7, 7], f16), T([64, 240, 7, 7], f16), T([64, 240, 7, 7], f16)], 1), {})
cnt: 6, (([T([64, 396, 7, 7], f16), T([64, 396, 7, 7], f16), T([64, 396, 7, 7], f16), T([64, 396, 7, 7], f16)], 1), {})
cnt: 3, (([T([64, 132, 7, 7], f16), T([64, 132, 7, 7], f16)], 1), {})
cnt: 3, (([T([64, 792, 7, 7], f16), T([64, 792, 7, 7], f16)], 1), {})
cnt: 1, (([T([64, 240, 14, 14], f16), T([64, 240, 14, 14], f16), T([64, 240, 14, 14], f16), T([64, 240, 14, 14], f16)], 1), {})
cnt: 1, (([T([64, 112, 28, 28], f16), T([64, 112, 28, 28], f16), T([64, 112, 28, 28], f16)], 1), {})
cnt: 1, (([T([64, 60, 56, 56], f16), T([64, 60, 56, 56], f16), T([64, 60, 56, 56], f16), T([64, 60, 56, 56], f16)], 1), {})
cnt: 1, (([T([64, 96, 56, 56], f16), T([64, 96, 56, 56], f16)], 1), {})
cnt: 1, (([T([64, 64, 112, 112], f16), T([64, 64, 112, 112], f16), T([64, 64, 112, 112], f16)], 1), {})
cnt: 1, (([T([64, 16, 112, 112], f16), T([64, 16, 112, 112], f16)], 1), {})
Operator: aten.clone.default
cnt: 1, ((T([64, 3, 224, 224], f16),), {})
cnt: 1, ((T([64, 240, 56, 56], f16),), {})
cnt: 1, ((T([64, 240, 28, 28], f16),), {})
cnt: 1, ((T([64, 20, 1, 1], f16),), {})
cnt: 7, ((T([64, 336, 28, 28], f16),), {})
cnt: 3, ((T([64, 28, 1, 1], f16),), {})
cnt: 1, ((T([64, 336, 14, 14], f16),), {})
cnt: 1, ((T([64, 14, 1, 1], f16),), {})
cnt: 8, ((T([64, 624, 14, 14], f16),), {})
cnt: 3, ((T([64, 26, 1, 1], f16),), {})
cnt: 1, ((T([64, 52, 1, 1], f16),), {})
cnt: 6, ((T([64, 480, 14, 14], f16),), {})
cnt: 4, ((T([64, 80, 1, 1], f16),), {})
cnt: 1, ((T([64, 960, 14, 14], f16),), {})
cnt: 1, ((T([64, 960, 7, 7], f16),), {})
cnt: 6, ((T([64, 1584, 7, 7], f16),), {})
cnt: 3, ((T([64, 132, 1, 1], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([64, 3, 224, 224], f16), T([32, 3, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 32, 112, 112], f16), T([32, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 32), {})
cnt: 1, ((T([64, 32, 112, 112], f16), T([32, 32, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([64, 16, 112, 112], f16, stride=(401408, 12544, 112, 1)), T([96, 16, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 64, 112, 112], f16, stride=(2408448, 12544, 112, 1)), T([64, 1, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 64), {})
cnt: 1, ((T([64, 64, 112, 112], f16, stride=(2408448, 12544, 112, 1)), T([64, 1, 5, 5], f16), None, [2, 2], [2, 2], [1, 1], False, [0, 0], 64), {})
cnt: 1, ((T([64, 64, 112, 112], f16, stride=(2408448, 12544, 112, 1)), T([64, 1, 7, 7], f16), None, [2, 2], [3, 3], [1, 1], False, [0, 0], 64), {})
cnt: 2, ((T([64, 96, 56, 56], f16, stride=(602112, 3136, 56, 1)), T([20, 96, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([64, 20, 56, 56], f16, stride=(125440, 3136, 56, 1)), T([60, 20, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 120, 56, 56], f16), T([120, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 120), {})
cnt: 2, ((T([64, 60, 56, 56], f16, stride=(376320, 3136, 56, 1)), T([20, 60, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 40, 56, 56], f16), T([240, 40, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 60, 56, 56], f16, stride=(752640, 3136, 56, 1)), T([60, 1, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 60), {})
cnt: 1, ((T([64, 60, 56, 56], f16, stride=(752640, 3136, 56, 1)), T([60, 1, 5, 5], f16), None, [2, 2], [2, 2], [1, 1], False, [0, 0], 60), {})
cnt: 1, ((T([64, 60, 56, 56], f16, stride=(752640, 3136, 56, 1)), T([60, 1, 7, 7], f16), None, [2, 2], [3, 3], [1, 1], False, [0, 0], 60), {})
cnt: 1, ((T([64, 60, 56, 56], f16, stride=(752640, 3136, 56, 1)), T([60, 1, 9, 9], f16), None, [2, 2], [4, 4], [1, 1], False, [0, 0], 60), {})
cnt: 1, ((T([64, 240, 1, 1], f16), T([20, 240, 1, 1], f16), T([20], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 20, 1, 1], f16), T([240, 20, 1, 1], f16), T([240], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 240, 28, 28], f16), T([56, 240, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 6, ((T([64, 28, 28, 28], f16, stride=(43904, 784, 28, 1)), T([168, 28, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([64, 168, 28, 28], f16, stride=(263424, 784, 28, 1)), T([168, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 168), {})
cnt: 3, ((T([64, 168, 28, 28], f16, stride=(263424, 784, 28, 1)), T([168, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 168), {})
cnt: 3, ((T([64, 336, 1, 1], f16), T([28, 336, 1, 1], f16), T([28], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([64, 28, 1, 1], f16), T([336, 28, 1, 1], f16), T([336], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 6, ((T([64, 168, 28, 28], f16, stride=(263424, 784, 28, 1)), T([28, 168, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 56, 28, 28], f16), T([336, 56, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 112, 28, 28], f16, stride=(263424, 784, 28, 1)), T([112, 1, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 112), {})
cnt: 1, ((T([64, 112, 28, 28], f16, stride=(263424, 784, 28, 1)), T([112, 1, 5, 5], f16), None, [2, 2], [2, 2], [1, 1], False, [0, 0], 112), {})
cnt: 1, ((T([64, 112, 28, 28], f16, stride=(263424, 784, 28, 1)), T([112, 1, 7, 7], f16), None, [2, 2], [3, 3], [1, 1], False, [0, 0], 112), {})
cnt: 1, ((T([64, 336, 1, 1], f16), T([14, 336, 1, 1], f16), T([14], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 14, 1, 1], f16), T([336, 14, 1, 1], f16), T([336], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 336, 14, 14], f16), T([104, 336, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 6, ((T([64, 52, 14, 14], f16, stride=(20384, 196, 14, 1)), T([312, 52, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([64, 156, 14, 14], f16, stride=(122304, 196, 14, 1)), T([156, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 156), {})
cnt: 3, ((T([64, 156, 14, 14], f16, stride=(122304, 196, 14, 1)), T([156, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 156), {})
cnt: 3, ((T([64, 156, 14, 14], f16, stride=(122304, 196, 14, 1)), T([156, 1, 7, 7], f16), None, [1, 1], [3, 3], [1, 1], False, [0, 0], 156), {})
cnt: 3, ((T([64, 156, 14, 14], f16, stride=(122304, 196, 14, 1)), T([156, 1, 9, 9], f16), None, [1, 1], [4, 4], [1, 1], False, [0, 0], 156), {})
cnt: 3, ((T([64, 624, 1, 1], f16), T([26, 624, 1, 1], f16), T([26], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([64, 26, 1, 1], f16), T([624, 26, 1, 1], f16), T([624], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 6, ((T([64, 312, 14, 14], f16, stride=(122304, 196, 14, 1)), T([52, 312, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 104, 14, 14], f16), T([624, 104, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 624, 14, 14], f16), T([624, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 624), {})
cnt: 1, ((T([64, 624, 1, 1], f16), T([52, 624, 1, 1], f16), T([52], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 52, 1, 1], f16), T([624, 52, 1, 1], f16), T([624], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 624, 14, 14], f16), T([160, 624, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 6, ((T([64, 80, 14, 14], f16, stride=(31360, 196, 14, 1)), T([240, 80, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([64, 120, 14, 14], f16, stride=(94080, 196, 14, 1)), T([120, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 120), {})
cnt: 3, ((T([64, 120, 14, 14], f16, stride=(94080, 196, 14, 1)), T([120, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 120), {})
cnt: 3, ((T([64, 120, 14, 14], f16, stride=(94080, 196, 14, 1)), T([120, 1, 7, 7], f16), None, [1, 1], [3, 3], [1, 1], False, [0, 0], 120), {})
cnt: 3, ((T([64, 120, 14, 14], f16, stride=(94080, 196, 14, 1)), T([120, 1, 9, 9], f16), None, [1, 1], [4, 4], [1, 1], False, [0, 0], 120), {})
cnt: 3, ((T([64, 480, 1, 1], f16), T([80, 480, 1, 1], f16), T([80], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([64, 80, 1, 1], f16), T([480, 80, 1, 1], f16), T([480], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 6, ((T([64, 240, 14, 14], f16, stride=(94080, 196, 14, 1)), T([80, 240, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 160, 14, 14], f16), T([960, 160, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 240, 14, 14], f16, stride=(188160, 196, 14, 1)), T([240, 1, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 240), {})
cnt: 1, ((T([64, 240, 14, 14], f16, stride=(188160, 196, 14, 1)), T([240, 1, 5, 5], f16), None, [2, 2], [2, 2], [1, 1], False, [0, 0], 240), {})
cnt: 1, ((T([64, 240, 14, 14], f16, stride=(188160, 196, 14, 1)), T([240, 1, 7, 7], f16), None, [2, 2], [3, 3], [1, 1], False, [0, 0], 240), {})
cnt: 1, ((T([64, 240, 14, 14], f16, stride=(188160, 196, 14, 1)), T([240, 1, 9, 9], f16), None, [2, 2], [4, 4], [1, 1], False, [0, 0], 240), {})
cnt: 1, ((T([64, 960, 1, 1], f16), T([80, 960, 1, 1], f16), T([80], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 80, 1, 1], f16), T([960, 80, 1, 1], f16), T([960], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 960, 7, 7], f16), T([264, 960, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([64, 264, 7, 7], f16), T([1584, 264, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([64, 396, 7, 7], f16, stride=(77616, 49, 7, 1)), T([396, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 396), {})
cnt: 3, ((T([64, 396, 7, 7], f16, stride=(77616, 49, 7, 1)), T([396, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 396), {})
cnt: 3, ((T([64, 396, 7, 7], f16, stride=(77616, 49, 7, 1)), T([396, 1, 7, 7], f16), None, [1, 1], [3, 3], [1, 1], False, [0, 0], 396), {})
cnt: 3, ((T([64, 396, 7, 7], f16, stride=(77616, 49, 7, 1)), T([396, 1, 9, 9], f16), None, [1, 1], [4, 4], [1, 1], False, [0, 0], 396), {})
cnt: 3, ((T([64, 1584, 1, 1], f16), T([132, 1584, 1, 1], f16), T([132], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([64, 132, 1, 1], f16), T([1584, 132, 1, 1], f16), T([1584], f16), [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 6, ((T([64, 792, 7, 7], f16, stride=(77616, 49, 7, 1)), T([132, 792, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([64, 264, 7, 7], f16), T([1536, 264, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([64, 1536, 7, 7], f16), T([64, 264, 7, 7], f16), T([1536, 264, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 6, ((T([64, 132, 7, 7], f16, stride=(12936, 49, 7, 1)), T([64, 792, 7, 7], f16, stride=(77616, 49, 7, 1)), T([132, 792, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([64, 1584, 1, 1], f16), T([64, 132, 1, 1], f16), T([1584, 132, 1, 1], f16), [1584], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 3, ((T([64, 132, 1, 1], f16), T([64, 1584, 1, 1], f16), T([132, 1584, 1, 1], f16), [132], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 3, ((T([64, 396, 7, 7], f16, stride=(77616, 49, 7, 1)), T([64, 396, 7, 7], f16, stride=(77616, 49, 7, 1)), T([396, 1, 9, 9], f16), [0], [1, 1], [4, 4], [1, 1], False, [0, 0], 396, [True, True, False]), {})
cnt: 3, ((T([64, 396, 7, 7], f16, stride=(77616, 49, 7, 1)), T([64, 396, 7, 7], f16, stride=(77616, 49, 7, 1)), T([396, 1, 7, 7], f16), [0], [1, 1], [3, 3], [1, 1], False, [0, 0], 396, [True, True, False]), {})
cnt: 3, ((T([64, 396, 7, 7], f16, stride=(77616, 49, 7, 1)), T([64, 396, 7, 7], f16, stride=(77616, 49, 7, 1)), T([396, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 396, [True, True, False]), {})
cnt: 3, ((T([64, 396, 7, 7], f16, stride=(77616, 49, 7, 1)), T([64, 396, 7, 7], f16, stride=(77616, 49, 7, 1)), T([396, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 396, [True, True, False]), {})
cnt: 3, ((T([64, 1584, 7, 7], f16), T([64, 264, 7, 7], f16), T([1584, 264, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 264, 7, 7], f16), T([64, 960, 7, 7], f16), T([264, 960, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 960, 1, 1], f16), T([64, 80, 1, 1], f16), T([960, 80, 1, 1], f16), [960], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([64, 80, 1, 1], f16), T([64, 960, 1, 1], f16), T([80, 960, 1, 1], f16), [80], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([64, 240, 7, 7], f16, stride=(47040, 49, 7, 1)), T([64, 240, 14, 14], f16, stride=(188160, 196, 14, 1)), T([240, 1, 9, 9], f16), [0], [2, 2], [4, 4], [1, 1], False, [0, 0], 240, [True, True, False]), {})
cnt: 1, ((T([64, 240, 7, 7], f16, stride=(47040, 49, 7, 1)), T([64, 240, 14, 14], f16, stride=(188160, 196, 14, 1)), T([240, 1, 7, 7], f16), [0], [2, 2], [3, 3], [1, 1], False, [0, 0], 240, [True, True, False]), {})
cnt: 1, ((T([64, 240, 7, 7], f16, stride=(47040, 49, 7, 1)), T([64, 240, 14, 14], f16, stride=(188160, 196, 14, 1)), T([240, 1, 5, 5], f16), [0], [2, 2], [2, 2], [1, 1], False, [0, 0], 240, [True, True, False]), {})
cnt: 1, ((T([64, 240, 7, 7], f16, stride=(47040, 49, 7, 1)), T([64, 240, 14, 14], f16, stride=(188160, 196, 14, 1)), T([240, 1, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 240, [True, True, False]), {})
cnt: 1, ((T([64, 960, 14, 14], f16), T([64, 160, 14, 14], f16), T([960, 160, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 6, ((T([64, 80, 14, 14], f16, stride=(31360, 196, 14, 1)), T([64, 240, 14, 14], f16, stride=(94080, 196, 14, 1)), T([80, 240, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([64, 480, 1, 1], f16), T([64, 80, 1, 1], f16), T([480, 80, 1, 1], f16), [480], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 3, ((T([64, 80, 1, 1], f16), T([64, 480, 1, 1], f16), T([80, 480, 1, 1], f16), [80], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 3, ((T([64, 120, 14, 14], f16, stride=(94080, 196, 14, 1)), T([64, 120, 14, 14], f16, stride=(94080, 196, 14, 1)), T([120, 1, 9, 9], f16), [0], [1, 1], [4, 4], [1, 1], False, [0, 0], 120, [True, True, False]), {})
cnt: 3, ((T([64, 120, 14, 14], f16, stride=(94080, 196, 14, 1)), T([64, 120, 14, 14], f16, stride=(94080, 196, 14, 1)), T([120, 1, 7, 7], f16), [0], [1, 1], [3, 3], [1, 1], False, [0, 0], 120, [True, True, False]), {})
cnt: 3, ((T([64, 120, 14, 14], f16, stride=(94080, 196, 14, 1)), T([64, 120, 14, 14], f16, stride=(94080, 196, 14, 1)), T([120, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 120, [True, True, False]), {})
cnt: 3, ((T([64, 120, 14, 14], f16, stride=(94080, 196, 14, 1)), T([64, 120, 14, 14], f16, stride=(94080, 196, 14, 1)), T([120, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 120, [True, True, False]), {})
cnt: 6, ((T([64, 240, 14, 14], f16, stride=(94080, 196, 14, 1)), T([64, 80, 14, 14], f16, stride=(31360, 196, 14, 1)), T([240, 80, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 160, 14, 14], f16), T([64, 624, 14, 14], f16), T([160, 624, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 624, 1, 1], f16), T([64, 52, 1, 1], f16), T([624, 52, 1, 1], f16), [624], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([64, 52, 1, 1], f16), T([64, 624, 1, 1], f16), T([52, 624, 1, 1], f16), [52], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([64, 624, 14, 14], f16), T([64, 624, 14, 14], f16), T([624, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 624, [True, True, False]), {})
cnt: 1, ((T([64, 624, 14, 14], f16), T([64, 104, 14, 14], f16), T([624, 104, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 6, ((T([64, 52, 14, 14], f16, stride=(20384, 196, 14, 1)), T([64, 312, 14, 14], f16, stride=(122304, 196, 14, 1)), T([52, 312, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([64, 624, 1, 1], f16), T([64, 26, 1, 1], f16), T([624, 26, 1, 1], f16), [624], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 3, ((T([64, 26, 1, 1], f16), T([64, 624, 1, 1], f16), T([26, 624, 1, 1], f16), [26], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 3, ((T([64, 156, 14, 14], f16, stride=(122304, 196, 14, 1)), T([64, 156, 14, 14], f16, stride=(122304, 196, 14, 1)), T([156, 1, 9, 9], f16), [0], [1, 1], [4, 4], [1, 1], False, [0, 0], 156, [True, True, False]), {})
cnt: 3, ((T([64, 156, 14, 14], f16, stride=(122304, 196, 14, 1)), T([64, 156, 14, 14], f16, stride=(122304, 196, 14, 1)), T([156, 1, 7, 7], f16), [0], [1, 1], [3, 3], [1, 1], False, [0, 0], 156, [True, True, False]), {})
cnt: 3, ((T([64, 156, 14, 14], f16, stride=(122304, 196, 14, 1)), T([64, 156, 14, 14], f16, stride=(122304, 196, 14, 1)), T([156, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 156, [True, True, False]), {})
cnt: 3, ((T([64, 156, 14, 14], f16, stride=(122304, 196, 14, 1)), T([64, 156, 14, 14], f16, stride=(122304, 196, 14, 1)), T([156, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 156, [True, True, False]), {})
cnt: 6, ((T([64, 312, 14, 14], f16, stride=(122304, 196, 14, 1)), T([64, 52, 14, 14], f16, stride=(20384, 196, 14, 1)), T([312, 52, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 104, 14, 14], f16), T([64, 336, 14, 14], f16), T([104, 336, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 336, 1, 1], f16), T([64, 14, 1, 1], f16), T([336, 14, 1, 1], f16), [336], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([64, 14, 1, 1], f16), T([64, 336, 1, 1], f16), T([14, 336, 1, 1], f16), [14], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([64, 112, 14, 14], f16, stride=(65856, 196, 14, 1)), T([64, 112, 28, 28], f16, stride=(263424, 784, 28, 1)), T([112, 1, 7, 7], f16), [0], [2, 2], [3, 3], [1, 1], False, [0, 0], 112, [True, True, False]), {})
cnt: 1, ((T([64, 112, 14, 14], f16, stride=(65856, 196, 14, 1)), T([64, 112, 28, 28], f16, stride=(263424, 784, 28, 1)), T([112, 1, 5, 5], f16), [0], [2, 2], [2, 2], [1, 1], False, [0, 0], 112, [True, True, False]), {})
cnt: 1, ((T([64, 112, 14, 14], f16, stride=(65856, 196, 14, 1)), T([64, 112, 28, 28], f16, stride=(263424, 784, 28, 1)), T([112, 1, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 112, [True, True, False]), {})
cnt: 1, ((T([64, 336, 28, 28], f16), T([64, 56, 28, 28], f16), T([336, 56, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 6, ((T([64, 28, 28, 28], f16, stride=(43904, 784, 28, 1)), T([64, 168, 28, 28], f16, stride=(263424, 784, 28, 1)), T([28, 168, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([64, 336, 1, 1], f16), T([64, 28, 1, 1], f16), T([336, 28, 1, 1], f16), [336], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 3, ((T([64, 28, 1, 1], f16), T([64, 336, 1, 1], f16), T([28, 336, 1, 1], f16), [28], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 3, ((T([64, 168, 28, 28], f16, stride=(263424, 784, 28, 1)), T([64, 168, 28, 28], f16, stride=(263424, 784, 28, 1)), T([168, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 168, [True, True, False]), {})
cnt: 3, ((T([64, 168, 28, 28], f16, stride=(263424, 784, 28, 1)), T([64, 168, 28, 28], f16, stride=(263424, 784, 28, 1)), T([168, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 168, [True, True, False]), {})
cnt: 6, ((T([64, 168, 28, 28], f16, stride=(263424, 784, 28, 1)), T([64, 28, 28, 28], f16, stride=(43904, 784, 28, 1)), T([168, 28, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 56, 28, 28], f16), T([64, 240, 28, 28], f16), T([56, 240, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 240, 1, 1], f16), T([64, 20, 1, 1], f16), T([240, 20, 1, 1], f16), [240], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([64, 20, 1, 1], f16), T([64, 240, 1, 1], f16), T([20, 240, 1, 1], f16), [20], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, True]), {})
cnt: 1, ((T([64, 60, 28, 28], f16, stride=(188160, 784, 28, 1)), T([64, 60, 56, 56], f16, stride=(752640, 3136, 56, 1)), T([60, 1, 9, 9], f16), [0], [2, 2], [4, 4], [1, 1], False, [0, 0], 60, [True, True, False]), {})
cnt: 1, ((T([64, 60, 28, 28], f16, stride=(188160, 784, 28, 1)), T([64, 60, 56, 56], f16, stride=(752640, 3136, 56, 1)), T([60, 1, 7, 7], f16), [0], [2, 2], [3, 3], [1, 1], False, [0, 0], 60, [True, True, False]), {})
cnt: 1, ((T([64, 60, 28, 28], f16, stride=(188160, 784, 28, 1)), T([64, 60, 56, 56], f16, stride=(752640, 3136, 56, 1)), T([60, 1, 5, 5], f16), [0], [2, 2], [2, 2], [1, 1], False, [0, 0], 60, [True, True, False]), {})
cnt: 1, ((T([64, 60, 28, 28], f16, stride=(188160, 784, 28, 1)), T([64, 60, 56, 56], f16, stride=(752640, 3136, 56, 1)), T([60, 1, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 60, [True, True, False]), {})
cnt: 1, ((T([64, 240, 56, 56], f16), T([64, 40, 56, 56], f16), T([240, 40, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([64, 20, 56, 56], f16, stride=(125440, 3136, 56, 1)), T([64, 60, 56, 56], f16, stride=(376320, 3136, 56, 1)), T([20, 60, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 120, 56, 56], f16), T([64, 120, 56, 56], f16), T([120, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 120, [True, True, False]), {})
cnt: 2, ((T([64, 60, 56, 56], f16, stride=(376320, 3136, 56, 1)), T([64, 20, 56, 56], f16, stride=(125440, 3136, 56, 1)), T([60, 20, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([64, 20, 56, 56], f16, stride=(125440, 3136, 56, 1)), T([64, 96, 56, 56], f16, stride=(602112, 3136, 56, 1)), T([20, 96, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 64, 56, 56], f16, stride=(602112, 3136, 56, 1)), T([64, 64, 112, 112], f16, stride=(2408448, 12544, 112, 1)), T([64, 1, 7, 7], f16), [0], [2, 2], [3, 3], [1, 1], False, [0, 0], 64, [True, True, False]), {})
cnt: 1, ((T([64, 64, 56, 56], f16, stride=(602112, 3136, 56, 1)), T([64, 64, 112, 112], f16, stride=(2408448, 12544, 112, 1)), T([64, 1, 5, 5], f16), [0], [2, 2], [2, 2], [1, 1], False, [0, 0], 64, [True, True, False]), {})
cnt: 1, ((T([64, 64, 56, 56], f16, stride=(602112, 3136, 56, 1)), T([64, 64, 112, 112], f16, stride=(2408448, 12544, 112, 1)), T([64, 1, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 64, [True, True, False]), {})
cnt: 2, ((T([64, 96, 112, 112], f16, stride=(2408448, 12544, 112, 1)), T([64, 16, 112, 112], f16, stride=(401408, 12544, 112, 1)), T([96, 16, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 32, 112, 112], f16), T([64, 32, 112, 112], f16), T([32, 32, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([64, 32, 112, 112], f16), T([64, 32, 112, 112], f16), T([32, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 32, [True, True, False]), {})
cnt: 1, ((T([64, 32, 112, 112], f16), T([64, 3, 224, 224], f16), T([32, 3, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [False, True, False]), {})
Operator: aten.copy_.default
cnt: 1, ((T([64, 3, 224, 224], f16), T([64, 3, 224, 224], f16)), {})
Operator: aten.div.Scalar
cnt: 1, ((T([64, 1536, 7, 7], f16, stride=(1536, 1, 0, 0)), 49), {})
cnt: 3, ((T([64, 1584, 7, 7], f16, stride=(1584, 1, 0, 0)), 49), {})
cnt: 1, ((T([64, 960, 7, 7], f16, stride=(960, 1, 0, 0)), 49), {})
cnt: 3, ((T([64, 480, 14, 14], f16, stride=(480, 1, 0, 0)), 196), {})
cnt: 4, ((T([64, 624, 14, 14], f16, stride=(624, 1, 0, 0)), 196), {})
cnt: 1, ((T([64, 336, 14, 14], f16, stride=(336, 1, 0, 0)), 196), {})
cnt: 3, ((T([64, 336, 28, 28], f16, stride=(336, 1, 0, 0)), 784), {})
cnt: 1, ((T([64, 240, 28, 28], f16, stride=(240, 1, 0, 0)), 784), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([64], i64),), {})
Operator: aten.mean.dim
cnt: 1, ((T([64, 240, 28, 28], f16), [2, 3], True), {})
cnt: 3, ((T([64, 336, 28, 28], f16), [2, 3], True), {})
cnt: 1, ((T([64, 336, 14, 14], f16), [2, 3], True), {})
cnt: 4, ((T([64, 624, 14, 14], f16), [2, 3], True), {})
cnt: 3, ((T([64, 480, 14, 14], f16), [2, 3], True), {})
cnt: 1, ((T([64, 960, 7, 7], f16), [2, 3], True), {})
cnt: 3, ((T([64, 1584, 7, 7], f16), [2, 3], True), {})
cnt: 1, ((T([64, 1536, 7, 7], f16), [-1, -2], True), {})
Operator: aten.mm.default
cnt: 1, ((T([64, 1000], f16), T([1000, 1536], f16)), {})
cnt: 1, ((T([1000, 64], f16, stride=(1, 1000)), T([64, 1536], f16)), {})
Operator: aten.mul.Tensor
cnt: 2, ((T([64, 240, 28, 28], f16), T([64, 240, 1, 1], f16)), {})
cnt: 6, ((T([64, 336, 28, 28], f16), T([64, 336, 1, 1], f16)), {})
cnt: 2, ((T([64, 336, 14, 14], f16), T([64, 336, 1, 1], f16)), {})
cnt: 8, ((T([64, 624, 14, 14], f16), T([64, 624, 1, 1], f16)), {})
cnt: 6, ((T([64, 480, 14, 14], f16), T([64, 480, 1, 1], f16)), {})
cnt: 2, ((T([64, 960, 7, 7], f16), T([64, 960, 1, 1], f16)), {})
cnt: 6, ((T([64, 1584, 7, 7], f16), T([64, 1584, 1, 1], f16)), {})
cnt: 3, ((T([64, 1584, 7, 7], f16), T([64, 1584, 7, 7], f16)), {})
cnt: 1, ((T([64, 960, 7, 7], f16), T([64, 960, 7, 7], f16)), {})
cnt: 3, ((T([64, 480, 14, 14], f16), T([64, 480, 14, 14], f16)), {})
cnt: 4, ((T([64, 624, 14, 14], f16), T([64, 624, 14, 14], f16)), {})
cnt: 1, ((T([64, 336, 14, 14], f16), T([64, 336, 14, 14], f16)), {})
cnt: 3, ((T([64, 336, 28, 28], f16), T([64, 336, 28, 28], f16)), {})
cnt: 1, ((T([64, 240, 28, 28], f16), T([64, 240, 28, 28], f16)), {})
Operator: aten.native_batch_norm.default
cnt: 3, ((T([64, 32, 112, 112], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 192, 112, 112], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 192, 56, 56], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([64, 40, 56, 56], f16), T([40], f16), T([40], f16), T([40], f16), T([40], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([64, 120, 56, 56], f16), T([120], f16), T([120], f16), T([120], f16), T([120], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 240, 56, 56], f16), T([240], f16), T([240], f16), T([240], f16), T([240], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 240, 28, 28], f16), T([240], f16), T([240], f16), T([240], f16), T([240], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([64, 56, 28, 28], f16), T([56], f16), T([56], f16), T([56], f16), T([56], f16), True, 0.1, 1e-05), {})
cnt: 7, ((T([64, 336, 28, 28], f16), T([336], f16), T([336], f16), T([336], f16), T([336], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 336, 14, 14], f16), T([336], f16), T([336], f16), T([336], f16), T([336], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([64, 104, 14, 14], f16), T([104], f16), T([104], f16), T([104], f16), T([104], f16), True, 0.1, 1e-05), {})
cnt: 8, ((T([64, 624, 14, 14], f16), T([624], f16), T([624], f16), T([624], f16), T([624], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([64, 160, 14, 14], f16), T([160], f16), T([160], f16), T([160], f16), T([160], f16), True, 0.1, 1e-05), {})
cnt: 6, ((T([64, 480, 14, 14], f16), T([480], f16), T([480], f16), T([480], f16), T([480], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 960, 14, 14], f16), T([960], f16), T([960], f16), T([960], f16), T([960], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 960, 7, 7], f16), T([960], f16), T([960], f16), T([960], f16), T([960], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([64, 264, 7, 7], f16), T([264], f16), T([264], f16), T([264], f16), T([264], f16), True, 0.1, 1e-05), {})
cnt: 6, ((T([64, 1584, 7, 7], f16), T([1584], f16), T([1584], f16), T([1584], f16), T([1584], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([64, 1536, 7, 7], f16), T([1536], f16), T([1536], f16), T([1536], f16), T([1536], f16), True, 0.1, 1e-05), {})
Operator: aten.native_batch_norm_backward.default
cnt: 1, ((T([64, 1536, 7, 7], f16), T([64, 1536, 7, 7], f16), T([1536], f16), T([1536], f16), T([1536], f16), T([1536], f32), T([1536], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([64, 264, 7, 7], f16), T([64, 264, 7, 7], f16), T([264], f16), T([264], f16), T([264], f16), T([264], f32), T([264], f32), True, 1e-05, [True, True, True]), {})
cnt: 6, ((T([64, 1584, 7, 7], f16), T([64, 1584, 7, 7], f16), T([1584], f16), T([1584], f16), T([1584], f16), T([1584], f32), T([1584], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 960, 7, 7], f16), T([64, 960, 7, 7], f16), T([960], f16), T([960], f16), T([960], f16), T([960], f32), T([960], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 960, 14, 14], f16), T([64, 960, 14, 14], f16), T([960], f16), T([960], f16), T([960], f16), T([960], f32), T([960], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([64, 160, 14, 14], f16), T([64, 160, 14, 14], f16), T([160], f16), T([160], f16), T([160], f16), T([160], f32), T([160], f32), True, 1e-05, [True, True, True]), {})
cnt: 6, ((T([64, 480, 14, 14], f16), T([64, 480, 14, 14], f16), T([480], f16), T([480], f16), T([480], f16), T([480], f32), T([480], f32), True, 1e-05, [True, True, True]), {})
cnt: 8, ((T([64, 624, 14, 14], f16), T([64, 624, 14, 14], f16), T([624], f16), T([624], f16), T([624], f16), T([624], f32), T([624], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([64, 104, 14, 14], f16), T([64, 104, 14, 14], f16), T([104], f16), T([104], f16), T([104], f16), T([104], f32), T([104], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 336, 14, 14], f16), T([64, 336, 14, 14], f16), T([336], f16), T([336], f16), T([336], f16), T([336], f32), T([336], f32), True, 1e-05, [True, True, True]), {})
cnt: 7, ((T([64, 336, 28, 28], f16), T([64, 336, 28, 28], f16), T([336], f16), T([336], f16), T([336], f16), T([336], f32), T([336], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([64, 56, 28, 28], f16), T([64, 56, 28, 28], f16), T([56], f16), T([56], f16), T([56], f16), T([56], f32), T([56], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 240, 28, 28], f16), T([64, 240, 28, 28], f16), T([240], f16), T([240], f16), T([240], f16), T([240], f32), T([240], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 240, 56, 56], f16), T([64, 240, 56, 56], f16), T([240], f16), T([240], f16), T([240], f16), T([240], f32), T([240], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([64, 40, 56, 56], f16), T([64, 40, 56, 56], f16), T([40], f16), T([40], f16), T([40], f16), T([40], f32), T([40], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([64, 120, 56, 56], f16), T([64, 120, 56, 56], f16), T([120], f16), T([120], f16), T([120], f16), T([120], f32), T([120], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 192, 56, 56], f16), T([64, 192, 56, 56], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f32), T([192], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([64, 192, 112, 112], f16), T([64, 192, 112, 112], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f32), T([192], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([64, 32, 112, 112], f16), T([64, 32, 112, 112], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 1e-05, [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([64, 1000], f16), T([64], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([64, 1000], f16), T([64], i64), None, 1, -100), {})
Operator: aten.relu_.default
cnt: 2, ((T([64, 32, 112, 112], f16),), {})
cnt: 1, ((T([64, 192, 112, 112], f16),), {})
cnt: 1, ((T([64, 192, 56, 56], f16),), {})
cnt: 2, ((T([64, 120, 56, 56], f16),), {})
cnt: 1, ((T([64, 1536, 7, 7], f16),), {})
Operator: aten.sigmoid.default
cnt: 1, ((T([64, 240, 1, 1], f16),), {})
cnt: 4, ((T([64, 336, 1, 1], f16),), {})
cnt: 4, ((T([64, 624, 1, 1], f16),), {})
cnt: 3, ((T([64, 480, 1, 1], f16),), {})
cnt: 1, ((T([64, 960, 1, 1], f16),), {})
cnt: 3, ((T([64, 1584, 1, 1], f16),), {})
Operator: aten.sigmoid_backward.default
cnt: 3, ((T([64, 1584, 1, 1], f16), T([64, 1584, 1, 1], f16)), {})
cnt: 1, ((T([64, 960, 1, 1], f16), T([64, 960, 1, 1], f16)), {})
cnt: 3, ((T([64, 480, 1, 1], f16), T([64, 480, 1, 1], f16)), {})
cnt: 4, ((T([64, 624, 1, 1], f16), T([64, 624, 1, 1], f16)), {})
cnt: 4, ((T([64, 336, 1, 1], f16), T([64, 336, 1, 1], f16)), {})
cnt: 1, ((T([64, 240, 1, 1], f16), T([64, 240, 1, 1], f16)), {})
Operator: aten.silu_.default
cnt: 1, ((T([64, 240, 56, 56], f16),), {})
cnt: 1, ((T([64, 240, 28, 28], f16),), {})
cnt: 1, ((T([64, 20, 1, 1], f16),), {})
cnt: 7, ((T([64, 336, 28, 28], f16),), {})
cnt: 3, ((T([64, 28, 1, 1], f16),), {})
cnt: 1, ((T([64, 336, 14, 14], f16),), {})
cnt: 1, ((T([64, 14, 1, 1], f16),), {})
cnt: 8, ((T([64, 624, 14, 14], f16),), {})
cnt: 3, ((T([64, 26, 1, 1], f16),), {})
cnt: 1, ((T([64, 52, 1, 1], f16),), {})
cnt: 6, ((T([64, 480, 14, 14], f16),), {})
cnt: 4, ((T([64, 80, 1, 1], f16),), {})
cnt: 1, ((T([64, 960, 14, 14], f16),), {})
cnt: 1, ((T([64, 960, 7, 7], f16),), {})
cnt: 6, ((T([64, 1584, 7, 7], f16),), {})
cnt: 3, ((T([64, 132, 1, 1], f16),), {})
Operator: aten.silu_backward.default
cnt: 3, ((T([64, 132, 1, 1], f16), T([64, 132, 1, 1], f16)), {})
cnt: 6, ((T([64, 1584, 7, 7], f16), T([64, 1584, 7, 7], f16)), {})
cnt: 4, ((T([64, 80, 1, 1], f16), T([64, 80, 1, 1], f16)), {})
cnt: 1, ((T([64, 960, 7, 7], f16), T([64, 960, 7, 7], f16)), {})
cnt: 1, ((T([64, 960, 14, 14], f16), T([64, 960, 14, 14], f16)), {})
cnt: 6, ((T([64, 480, 14, 14], f16), T([64, 480, 14, 14], f16)), {})
cnt: 1, ((T([64, 52, 1, 1], f16), T([64, 52, 1, 1], f16)), {})
cnt: 8, ((T([64, 624, 14, 14], f16), T([64, 624, 14, 14], f16)), {})
cnt: 3, ((T([64, 26, 1, 1], f16), T([64, 26, 1, 1], f16)), {})
cnt: 1, ((T([64, 14, 1, 1], f16), T([64, 14, 1, 1], f16)), {})
cnt: 1, ((T([64, 336, 14, 14], f16), T([64, 336, 14, 14], f16)), {})
cnt: 7, ((T([64, 336, 28, 28], f16), T([64, 336, 28, 28], f16)), {})
cnt: 3, ((T([64, 28, 1, 1], f16), T([64, 28, 1, 1], f16)), {})
cnt: 1, ((T([64, 20, 1, 1], f16), T([64, 20, 1, 1], f16)), {})
cnt: 1, ((T([64, 240, 28, 28], f16), T([64, 240, 28, 28], f16)), {})
cnt: 1, ((T([64, 240, 56, 56], f16), T([64, 240, 56, 56], f16)), {})
Operator: aten.split_with_sizes.default
cnt: 1, ((T([64, 32, 112, 112], f16), [16, 16], 1), {})
cnt: 1, ((T([64, 192, 112, 112], f16), [64, 64, 64], 1), {})
cnt: 1, ((T([64, 192, 56, 56], f16), [96, 96], 1), {})
cnt: 1, ((T([64, 40, 56, 56], f16), [20, 20], 1), {})
cnt: 1, ((T([64, 120, 56, 56], f16), [60, 60], 1), {})
cnt: 1, ((T([64, 240, 56, 56], f16), [60, 60, 60, 60], 1), {})
cnt: 3, ((T([64, 56, 28, 28], f16), [28, 28], 1), {})
cnt: 6, ((T([64, 336, 28, 28], f16), [168, 168], 1), {})
cnt: 1, ((T([64, 336, 28, 28], f16), [112, 112, 112], 1), {})
cnt: 3, ((T([64, 104, 14, 14], f16), [52, 52], 1), {})
cnt: 3, ((T([64, 624, 14, 14], f16), [156, 156, 156, 156], 1), {})
cnt: 3, ((T([64, 624, 14, 14], f16), [312, 312], 1), {})
cnt: 3, ((T([64, 160, 14, 14], f16), [80, 80], 1), {})
cnt: 3, ((T([64, 480, 14, 14], f16), [120, 120, 120, 120], 1), {})
cnt: 3, ((T([64, 480, 14, 14], f16), [240, 240], 1), {})
cnt: 1, ((T([64, 960, 14, 14], f16), [240, 240, 240, 240], 1), {})
cnt: 3, ((T([64, 1584, 7, 7], f16), [396, 396, 396, 396], 1), {})
cnt: 3, ((T([64, 1584, 7, 7], f16), [792, 792], 1), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([64, 1000], f16), [0], True), {})
cnt: 3, ((T([64, 1584, 7, 7], f16), [2, 3], True), {})
cnt: 1, ((T([64, 960, 7, 7], f16), [2, 3], True), {})
cnt: 3, ((T([64, 480, 14, 14], f16), [2, 3], True), {})
cnt: 4, ((T([64, 624, 14, 14], f16), [2, 3], True), {})
cnt: 1, ((T([64, 336, 14, 14], f16), [2, 3], True), {})
cnt: 3, ((T([64, 336, 28, 28], f16), [2, 3], True), {})
cnt: 1, ((T([64, 240, 28, 28], f16), [2, 3], True), {})
Operator: aten.threshold_backward.default
cnt: 1, ((T([64, 1536, 7, 7], f16), T([64, 1536, 7, 7], f16), 0), {})
cnt: 2, ((T([64, 120, 56, 56], f16), T([64, 120, 56, 56], f16), 0), {})
cnt: 1, ((T([64, 192, 56, 56], f16), T([64, 192, 56, 56], f16), 0), {})
cnt: 1, ((T([64, 192, 112, 112], f16), T([64, 192, 112, 112], f16), 0), {})
cnt: 2, ((T([64, 32, 112, 112], f16), T([64, 32, 112, 112], f16), 0), {})

View File

@ -0,0 +1,170 @@
Operator: aten._log_softmax.default
cnt: 1, ((T([128, 1000], f16), 1, False), {})
Operator: aten._log_softmax_backward_data.default
cnt: 1, ((T([128, 1000], f16), T([128, 1000], f16), 1, f16), {})
Operator: aten.add.Tensor
cnt: 52, ((T([], i64), 1), {})
cnt: 4, ((T([128, 24, 56, 56], f16), T([128, 24, 56, 56], f16)), {})
cnt: 4, ((T([128, 40, 28, 28], f16), T([128, 40, 28, 28], f16)), {})
cnt: 4, ((T([128, 80, 14, 14], f16), T([128, 80, 14, 14], f16)), {})
cnt: 2, ((T([128, 96, 14, 14], f16), T([128, 96, 14, 14], f16)), {})
cnt: 6, ((T([128, 192, 7, 7], f16), T([128, 192, 7, 7], f16)), {})
Operator: aten.addmm.default
cnt: 1, ((T([1000], f16), T([128, 1280], f16), T([1280, 1000], f16, stride=(1, 1280))), {})
Operator: aten.clone.default
cnt: 1, ((T([128, 3, 224, 224], f16),), {})
Operator: aten.convolution.default
cnt: 1, ((T([128, 3, 224, 224], f16), T([32, 3, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 32, 112, 112], f16), T([32, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 32), {})
cnt: 1, ((T([128, 32, 112, 112], f16), T([16, 32, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([48, 16, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 48, 112, 112], f16), T([48, 1, 3, 3], f16), None, [2, 2], [1, 1], [1, 1], False, [0, 0], 48), {})
cnt: 1, ((T([128, 48, 56, 56], f16), T([24, 48, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 24, 56, 56], f16), T([72, 24, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 72, 56, 56], f16), T([72, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 72), {})
cnt: 2, ((T([128, 72, 56, 56], f16), T([24, 72, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 72, 56, 56], f16), T([72, 1, 5, 5], f16), None, [2, 2], [2, 2], [1, 1], False, [0, 0], 72), {})
cnt: 1, ((T([128, 72, 28, 28], f16), T([40, 72, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 40, 28, 28], f16), T([120, 40, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 120, 28, 28], f16), T([120, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 120), {})
cnt: 2, ((T([128, 120, 28, 28], f16), T([40, 120, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 40, 28, 28], f16), T([240, 40, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 240, 28, 28], f16), T([240, 1, 5, 5], f16), None, [2, 2], [2, 2], [1, 1], False, [0, 0], 240), {})
cnt: 1, ((T([128, 240, 14, 14], f16), T([80, 240, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 80, 14, 14], f16), T([480, 80, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 480, 14, 14], f16), T([480, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 480), {})
cnt: 2, ((T([128, 480, 14, 14], f16), T([80, 480, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 480, 14, 14], f16), T([480, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 480), {})
cnt: 1, ((T([128, 480, 14, 14], f16), T([96, 480, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 2, ((T([128, 96, 14, 14], f16), T([576, 96, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 576, 14, 14], f16), T([576, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 576), {})
cnt: 1, ((T([128, 576, 14, 14], f16), T([96, 576, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 576, 14, 14], f16), T([576, 1, 5, 5], f16), None, [2, 2], [2, 2], [1, 1], False, [0, 0], 576), {})
cnt: 1, ((T([128, 576, 7, 7], f16), T([192, 576, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 4, ((T([128, 192, 7, 7], f16), T([1152, 192, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 3, ((T([128, 1152, 7, 7], f16), T([1152, 1, 5, 5], f16), None, [1, 1], [2, 2], [1, 1], False, [0, 0], 1152), {})
cnt: 3, ((T([128, 1152, 7, 7], f16), T([192, 1152, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 1152, 7, 7], f16), T([1152, 1, 3, 3], f16), None, [1, 1], [1, 1], [1, 1], False, [0, 0], 1152), {})
cnt: 1, ((T([128, 1152, 7, 7], f16), T([320, 1152, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
cnt: 1, ((T([128, 320, 7, 7], f16), T([1280, 320, 1, 1], f16), None, [1, 1], [0, 0], [1, 1], False, [0, 0], 1), {})
Operator: aten.convolution_backward.default
cnt: 1, ((T([128, 1280, 7, 7], f16), T([128, 320, 7, 7], f16), T([1280, 320, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 320, 7, 7], f16), T([128, 1152, 7, 7], f16), T([320, 1152, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 1152, 7, 7], f16), T([128, 1152, 7, 7], f16), T([1152, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 1152, [True, True, False]), {})
cnt: 4, ((T([128, 1152, 7, 7], f16), T([128, 192, 7, 7], f16), T([1152, 192, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 192, 7, 7], f16), T([128, 1152, 7, 7], f16), T([192, 1152, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 3, ((T([128, 1152, 7, 7], f16), T([128, 1152, 7, 7], f16), T([1152, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 1152, [True, True, False]), {})
cnt: 1, ((T([128, 192, 7, 7], f16), T([128, 576, 7, 7], f16), T([192, 576, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 576, 7, 7], f16), T([128, 576, 14, 14], f16), T([576, 1, 5, 5], f16), [0], [2, 2], [2, 2], [1, 1], False, [0, 0], 576, [True, True, False]), {})
cnt: 2, ((T([128, 576, 14, 14], f16), T([128, 96, 14, 14], f16), T([576, 96, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 96, 14, 14], f16), T([128, 576, 14, 14], f16), T([96, 576, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 576, 14, 14], f16), T([128, 576, 14, 14], f16), T([576, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 576, [True, True, False]), {})
cnt: 1, ((T([128, 96, 14, 14], f16), T([128, 480, 14, 14], f16), T([96, 480, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 480, 14, 14], f16), T([128, 480, 14, 14], f16), T([480, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 480, [True, True, False]), {})
cnt: 3, ((T([128, 480, 14, 14], f16), T([128, 80, 14, 14], f16), T([480, 80, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 80, 14, 14], f16), T([128, 480, 14, 14], f16), T([80, 480, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 480, 14, 14], f16), T([128, 480, 14, 14], f16), T([480, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 480, [True, True, False]), {})
cnt: 1, ((T([128, 80, 14, 14], f16), T([128, 240, 14, 14], f16), T([80, 240, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 240, 14, 14], f16), T([128, 240, 28, 28], f16), T([240, 1, 5, 5], f16), [0], [2, 2], [2, 2], [1, 1], False, [0, 0], 240, [True, True, False]), {})
cnt: 1, ((T([128, 240, 28, 28], f16), T([128, 40, 28, 28], f16), T([240, 40, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 40, 28, 28], f16), T([128, 120, 28, 28], f16), T([40, 120, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 120, 28, 28], f16), T([128, 120, 28, 28], f16), T([120, 1, 5, 5], f16), [0], [1, 1], [2, 2], [1, 1], False, [0, 0], 120, [True, True, False]), {})
cnt: 2, ((T([128, 120, 28, 28], f16), T([128, 40, 28, 28], f16), T([120, 40, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 40, 28, 28], f16), T([128, 72, 28, 28], f16), T([40, 72, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 72, 28, 28], f16), T([128, 72, 56, 56], f16), T([72, 1, 5, 5], f16), [0], [2, 2], [2, 2], [1, 1], False, [0, 0], 72, [True, True, False]), {})
cnt: 3, ((T([128, 72, 56, 56], f16), T([128, 24, 56, 56], f16), T([72, 24, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 24, 56, 56], f16), T([128, 72, 56, 56], f16), T([24, 72, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 2, ((T([128, 72, 56, 56], f16), T([128, 72, 56, 56], f16), T([72, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 72, [True, True, False]), {})
cnt: 1, ((T([128, 24, 56, 56], f16), T([128, 48, 56, 56], f16), T([24, 48, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 48, 56, 56], f16), T([128, 48, 112, 112], f16), T([48, 1, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 48, [True, True, False]), {})
cnt: 1, ((T([128, 48, 112, 112], f16), T([128, 16, 112, 112], f16), T([48, 16, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([128, 32, 112, 112], f16), T([16, 32, 1, 1], f16), [0], [1, 1], [0, 0], [1, 1], False, [0, 0], 1, [True, True, False]), {})
cnt: 1, ((T([128, 32, 112, 112], f16), T([128, 32, 112, 112], f16), T([32, 1, 3, 3], f16), [0], [1, 1], [1, 1], [1, 1], False, [0, 0], 32, [True, True, False]), {})
cnt: 1, ((T([128, 32, 112, 112], f16), T([128, 3, 224, 224], f16), T([32, 3, 3, 3], f16), [0], [2, 2], [1, 1], [1, 1], False, [0, 0], 1, [False, True, False]), {})
Operator: aten.copy_.default
cnt: 1, ((T([128, 3, 224, 224], f16), T([128, 3, 224, 224], f16)), {})
Operator: aten.div.Scalar
cnt: 1, ((T([128, 1280, 7, 7], f16, stride=(1280, 1, 0, 0)), 49), {})
Operator: aten.lift_fresh_copy.default
cnt: 1, ((T([128], i64),), {})
Operator: aten.mean.dim
cnt: 1, ((T([128, 1280, 7, 7], f16), [-1, -2], True), {})
Operator: aten.mm.default
cnt: 1, ((T([128, 1000], f16), T([1000, 1280], f16)), {})
cnt: 1, ((T([1000, 128], f16, stride=(1, 1000)), T([128, 1280], f16)), {})
Operator: aten.native_batch_norm.default
cnt: 2, ((T([128, 32, 112, 112], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([16], f16), T([16], f16), T([16], f16), T([16], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 48, 112, 112], f16), T([48], f16), T([48], f16), T([48], f16), T([48], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 48, 56, 56], f16), T([48], f16), T([48], f16), T([48], f16), T([48], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 24, 56, 56], f16), T([24], f16), T([24], f16), T([24], f16), T([24], f16), True, 0.1, 1e-05), {})
cnt: 5, ((T([128, 72, 56, 56], f16), T([72], f16), T([72], f16), T([72], f16), T([72], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 72, 28, 28], f16), T([72], f16), T([72], f16), T([72], f16), T([72], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 40, 28, 28], f16), T([40], f16), T([40], f16), T([40], f16), T([40], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([128, 120, 28, 28], f16), T([120], f16), T([120], f16), T([120], f16), T([120], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 240, 28, 28], f16), T([240], f16), T([240], f16), T([240], f16), T([240], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 240, 14, 14], f16), T([240], f16), T([240], f16), T([240], f16), T([240], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 80, 14, 14], f16), T([80], f16), T([80], f16), T([80], f16), T([80], f16), True, 0.1, 1e-05), {})
cnt: 6, ((T([128, 480, 14, 14], f16), T([480], f16), T([480], f16), T([480], f16), T([480], f16), True, 0.1, 1e-05), {})
cnt: 2, ((T([128, 96, 14, 14], f16), T([96], f16), T([96], f16), T([96], f16), T([96], f16), True, 0.1, 1e-05), {})
cnt: 3, ((T([128, 576, 14, 14], f16), T([576], f16), T([576], f16), T([576], f16), T([576], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 576, 7, 7], f16), T([576], f16), T([576], f16), T([576], f16), T([576], f16), True, 0.1, 1e-05), {})
cnt: 4, ((T([128, 192, 7, 7], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f16), True, 0.1, 1e-05), {})
cnt: 8, ((T([128, 1152, 7, 7], f16), T([1152], f16), T([1152], f16), T([1152], f16), T([1152], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 320, 7, 7], f16), T([320], f16), T([320], f16), T([320], f16), T([320], f16), True, 0.1, 1e-05), {})
cnt: 1, ((T([128, 1280, 7, 7], f16), T([1280], f16), T([1280], f16), T([1280], f16), T([1280], f16), True, 0.1, 1e-05), {})
Operator: aten.native_batch_norm_backward.default
cnt: 1, ((T([128, 1280, 7, 7], f16), T([128, 1280, 7, 7], f16), T([1280], f16), T([1280], f16), T([1280], f16), T([1280], f32), T([1280], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 320, 7, 7], f16), T([128, 320, 7, 7], f16), T([320], f16), T([320], f16), T([320], f16), T([320], f32), T([320], f32), True, 1e-05, [True, True, True]), {})
cnt: 8, ((T([128, 1152, 7, 7], f16), T([128, 1152, 7, 7], f16), T([1152], f16), T([1152], f16), T([1152], f16), T([1152], f32), T([1152], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([128, 192, 7, 7], f16), T([128, 192, 7, 7], f16), T([192], f16), T([192], f16), T([192], f16), T([192], f32), T([192], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 576, 7, 7], f16), T([128, 576, 7, 7], f16), T([576], f16), T([576], f16), T([576], f16), T([576], f32), T([576], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 576, 14, 14], f16), T([128, 576, 14, 14], f16), T([576], f16), T([576], f16), T([576], f16), T([576], f32), T([576], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 96, 14, 14], f16), T([128, 96, 14, 14], f16), T([96], f16), T([96], f16), T([96], f16), T([96], f32), T([96], f32), True, 1e-05, [True, True, True]), {})
cnt: 6, ((T([128, 480, 14, 14], f16), T([128, 480, 14, 14], f16), T([480], f16), T([480], f16), T([480], f16), T([480], f32), T([480], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 80, 14, 14], f16), T([128, 80, 14, 14], f16), T([80], f16), T([80], f16), T([80], f16), T([80], f32), T([80], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 240, 14, 14], f16), T([128, 240, 14, 14], f16), T([240], f16), T([240], f16), T([240], f16), T([240], f32), T([240], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 240, 28, 28], f16), T([128, 240, 28, 28], f16), T([240], f16), T([240], f16), T([240], f16), T([240], f32), T([240], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 40, 28, 28], f16), T([128, 40, 28, 28], f16), T([40], f16), T([40], f16), T([40], f16), T([40], f32), T([40], f32), True, 1e-05, [True, True, True]), {})
cnt: 4, ((T([128, 120, 28, 28], f16), T([128, 120, 28, 28], f16), T([120], f16), T([120], f16), T([120], f16), T([120], f32), T([120], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 72, 28, 28], f16), T([128, 72, 28, 28], f16), T([72], f16), T([72], f16), T([72], f16), T([72], f32), T([72], f32), True, 1e-05, [True, True, True]), {})
cnt: 5, ((T([128, 72, 56, 56], f16), T([128, 72, 56, 56], f16), T([72], f16), T([72], f16), T([72], f16), T([72], f32), T([72], f32), True, 1e-05, [True, True, True]), {})
cnt: 3, ((T([128, 24, 56, 56], f16), T([128, 24, 56, 56], f16), T([24], f16), T([24], f16), T([24], f16), T([24], f32), T([24], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 48, 56, 56], f16), T([128, 48, 56, 56], f16), T([48], f16), T([48], f16), T([48], f16), T([48], f32), T([48], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 48, 112, 112], f16), T([128, 48, 112, 112], f16), T([48], f16), T([48], f16), T([48], f16), T([48], f32), T([48], f32), True, 1e-05, [True, True, True]), {})
cnt: 1, ((T([128, 16, 112, 112], f16), T([128, 16, 112, 112], f16), T([16], f16), T([16], f16), T([16], f16), T([16], f32), T([16], f32), True, 1e-05, [True, True, True]), {})
cnt: 2, ((T([128, 32, 112, 112], f16), T([128, 32, 112, 112], f16), T([32], f16), T([32], f16), T([32], f16), T([32], f32), T([32], f32), True, 1e-05, [True, True, True]), {})
Operator: aten.nll_loss_backward.default
cnt: 1, ((T([], f16), T([128, 1000], f16), T([128], i64), None, 1, -100, T([], f16)), {})
Operator: aten.nll_loss_forward.default
cnt: 1, ((T([128, 1000], f16), T([128], i64), None, 1, -100), {})
Operator: aten.relu_.default
cnt: 2, ((T([128, 32, 112, 112], f16),), {})
cnt: 1, ((T([128, 48, 112, 112], f16),), {})
cnt: 1, ((T([128, 48, 56, 56], f16),), {})
cnt: 5, ((T([128, 72, 56, 56], f16),), {})
cnt: 1, ((T([128, 72, 28, 28], f16),), {})
cnt: 4, ((T([128, 120, 28, 28], f16),), {})
cnt: 1, ((T([128, 240, 28, 28], f16),), {})
cnt: 1, ((T([128, 240, 14, 14], f16),), {})
cnt: 6, ((T([128, 480, 14, 14], f16),), {})
cnt: 3, ((T([128, 576, 14, 14], f16),), {})
cnt: 1, ((T([128, 576, 7, 7], f16),), {})
cnt: 8, ((T([128, 1152, 7, 7], f16),), {})
cnt: 1, ((T([128, 1280, 7, 7], f16),), {})
Operator: aten.sum.SymInt
cnt: 1, ((T([128, 1000], f16), [0], True), {})
Operator: aten.threshold_backward.default
cnt: 1, ((T([128, 1280, 7, 7], f16), T([128, 1280, 7, 7], f16), 0), {})
cnt: 8, ((T([128, 1152, 7, 7], f16), T([128, 1152, 7, 7], f16), 0), {})
cnt: 1, ((T([128, 576, 7, 7], f16), T([128, 576, 7, 7], f16), 0), {})
cnt: 3, ((T([128, 576, 14, 14], f16), T([128, 576, 14, 14], f16), 0), {})
cnt: 6, ((T([128, 480, 14, 14], f16), T([128, 480, 14, 14], f16), 0), {})
cnt: 1, ((T([128, 240, 14, 14], f16), T([128, 240, 14, 14], f16), 0), {})
cnt: 1, ((T([128, 240, 28, 28], f16), T([128, 240, 28, 28], f16), 0), {})
cnt: 4, ((T([128, 120, 28, 28], f16), T([128, 120, 28, 28], f16), 0), {})
cnt: 1, ((T([128, 72, 28, 28], f16), T([128, 72, 28, 28], f16), 0), {})
cnt: 5, ((T([128, 72, 56, 56], f16), T([128, 72, 56, 56], f16), 0), {})
cnt: 1, ((T([128, 48, 56, 56], f16), T([128, 48, 56, 56], f16), 0), {})
cnt: 1, ((T([128, 48, 112, 112], f16), T([128, 48, 112, 112], f16), 0), {})
cnt: 2, ((T([128, 32, 112, 112], f16), T([128, 32, 112, 112], f16), 0), {})

Some files were not shown because too many files have changed in this diff Show More