Compare commits

..

16 Commits

96 changed files with 6032 additions and 274 deletions

View File

@@ -32,7 +32,7 @@
To export a 🤗 Transformers model to ONNX, first install an extra dependency:
```bash
pip install optimum-onnx
pip install optimum[exporters]
```
To see all the available arguments, refer to the [🤗 Optimum documentation](https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/export_a_model#exporting-a-model-to-onnx-using-the-cli), or view the help on the command line:
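The help invocation itself falls outside this hunk; presumably it is the standard Optimum CLI help command:
```bash
optimum-cli export onnx --help
```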
@@ -111,3 +111,60 @@ optimum-cli export onnx --model keras-io/transformers-qa distilbert_base_cased_s
### Exporting a model for an unsupported architecture
If you would like to contribute by adding support for a model that cannot currently be exported, first check whether it is supported in [`optimum.exporters.onnx`](https://huggingface.co/docs/optimum/exporters/onnx/overview); if it is not, you can [contribute to 🤗 Optimum](https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/contribute) directly.
### Exporting a model with `transformers.onnx`
<Tip warning={true}>
`transformers.onnx` is no longer supported. Please export models with 🤗 Optimum as described above. This section will be removed in future releases.
</Tip>
To export a 🤗 Transformers model to ONNX with `transformers.onnx`, install the extra dependencies:
```bash
pip install transformers[onnx]
```
Use the `transformers.onnx` package as a Python module to export a checkpoint using a ready-made configuration:
```bash
python -m transformers.onnx --model=distilbert/distilbert-base-uncased onnx/
```
This exports an ONNX graph of the checkpoint specified by the `--model` argument. Pass any checkpoint on the 🤗 Hub or one stored locally.
The resulting `model.onnx` file can then be run on one of the many accelerators that support the ONNX standard. For example, load and run the model with ONNX Runtime as follows:
```python
>>> from transformers import AutoTokenizer
>>> from onnxruntime import InferenceSession
>>> tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
>>> session = InferenceSession("onnx/model.onnx")
>>> # ONNX Runtime expects NumPy arrays as input
>>> inputs = tokenizer("Using DistilBERT with ONNX Runtime!", return_tensors="np")
>>> outputs = session.run(output_names=["last_hidden_state"], input_feed=dict(inputs))
```
The required output names (such as `["last_hidden_state"]`) can be obtained by looking at the ONNX configuration of each model. For example, for DistilBERT we have:
```python
>>> from transformers.models.distilbert import DistilBertConfig, DistilBertOnnxConfig
>>> config = DistilBertConfig()
>>> onnx_config = DistilBertOnnxConfig(config)
>>> print(list(onnx_config.outputs.keys()))
["last_hidden_state"]
```
The process is identical for TensorFlow checkpoints on the Hub. For example, export a pure TensorFlow checkpoint as follows:
```bash
python -m transformers.onnx --model=keras-io/transformers-qa onnx/
```
To export a model stored locally, save the model's weights and tokenizer in the same directory (for example `local-pt-checkpoint`), then export it to ONNX by pointing the `--model` argument of the `transformers.onnx` package to that directory:
```bash
python -m transformers.onnx --model=local-pt-checkpoint onnx/
```
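As an optional sanity check (not part of the original guide), the exported graph can be validated with the standalone `onnx` package, assuming it is installed:
```python
import onnx

# load the exported graph and verify it is structurally valid
model = onnx.load("onnx/model.onnx")
onnx.checker.check_model(model)  # raises if the graph is malformed

# the graph outputs match the names declared in the model's OnnxConfig
print([output.name for output in model.graph.output])
```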

View File

@@ -33,7 +33,7 @@ Export a Transformers model to ONNX with the Optimum CLI or the `optimum.onnxrun
Run the command below to install Optimum and the [exporters](https://huggingface.co/docs/optimum/exporters/overview) module.
```bash
pip install optimum-onnx
pip install optimum[exporters]
```
> [!TIP]

View File

@@ -0,0 +1,50 @@
<!--Copyright 2020 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Exporting 🤗 Transformers models to ONNX
🤗 Transformers provides a `transformers.onnx` package that lets you convert model checkpoints to an ONNX graph by leveraging configuration objects.
See the [guide](../serialization) for more details.
## ONNX Configurations
Three abstract classes are provided; which one to inherit from depends on the type of model architecture you want to export:
* Encoder-based models inherit from [`~onnx.config.OnnxConfig`]
* Decoder-based models inherit from [`~onnx.config.OnnxConfigWithPast`]
* Encoder-decoder models inherit from [`~onnx.config.OnnxSeq2SeqConfigWithPast`]
### OnnxConfig
[[autodoc]] onnx.config.OnnxConfig
### OnnxConfigWithPast
[[autodoc]] onnx.config.OnnxConfigWithPast
### OnnxSeq2SeqConfigWithPast
[[autodoc]] onnx.config.OnnxSeq2SeqConfigWithPast
## ONNX Features
Each ONNX configuration is associated with a set of _features_ that let you export models for different types of topologies or tasks.
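As an illustration (not part of the original page), the set of features for an architecture can be queried through the legacy feature registry, assuming `FeaturesManager` is restored along with these configurations:
```python
from transformers.onnx.features import FeaturesManager

# list the features (tasks) the BERT architecture can be exported with,
# e.g. "default", "masked-lm", "sequence-classification", ...
print(list(FeaturesManager.get_supported_features_for_model_type("bert").keys()))
```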

View File

@@ -47,7 +47,7 @@ Models exported to ONNX format can be used as follows
To export a 🤗 Transformers model to ONNX, first install the extra dependencies:
```bash
pip install optimum-onnx
pip install optimum[exporters]
```
To see all available arguments, refer to the [🤗 Optimum documentation](https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/export_a_model#exporting-a-model-to-onnx-using-the-cli), or view the help on the command line:
@@ -128,3 +128,64 @@ Instead of the CLI, you can export a 🤗 Transformers model to ONNX programmatically
### Exporting a model for an unsupported architecture
To contribute support for a model that cannot currently be exported, first check whether it is supported in [`optimum.exporters.onnx`](https://huggingface.co/docs/optimum/exporters/onnx/overview); if not, [contribute to 🤗 Optimum](https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/contribute) directly.
### Exporting a model with `transformers.onnx`
<Tip warning={true}>
`transformers.onnx` is no longer maintained, so please export models with 🤗 Optimum as described above. This section will be removed in a future version.
</Tip>
To export a 🤗 Transformers model to ONNX, install the extra dependencies:
```bash
pip install transformers[onnx]
```
Use the `transformers.onnx` package as a Python module to export a checkpoint with a ready-made configuration as follows:
```bash
python -m transformers.onnx --model=distilbert/distilbert-base-uncased onnx/
```
This exports an ONNX graph of the checkpoint defined by the `--model` argument. You can pass any checkpoint on the 🤗 Hub or one stored locally. The exported `model.onnx` file can be run on one of the many accelerators that support the ONNX standard. For example, load and run the model with ONNX Runtime as follows:
```python
>>> from transformers import AutoTokenizer
>>> from onnxruntime import InferenceSession
>>> tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
>>> session = InferenceSession("onnx/model.onnx")
>>> # ONNX Runtime expects NumPy arrays as input
>>> inputs = tokenizer("Using DistilBERT with ONNX Runtime!", return_tensors="np")
>>> outputs = session.run(output_names=["last_hidden_state"], input_feed=dict(inputs))
```
The required output names (e.g. `["last_hidden_state"]`) can be obtained by checking the ONNX configuration of each model. For example, for DistilBERT:
```python
>>> from transformers.models.distilbert import DistilBertConfig, DistilBertOnnxConfig
>>> config = DistilBertConfig()
>>> onnx_config = DistilBertOnnxConfig(config)
>>> print(list(onnx_config.outputs.keys()))
["last_hidden_state"]
```
The process is the same for exporting a pure TensorFlow checkpoint from the Hub programmatically:
```bash
python -m transformers.onnx --model=keras-io/transformers-qa onnx/
```
To export a locally stored model, save the model's weights and tokenizer files in the same directory (e.g. `local-pt-checkpoint`), then export to ONNX by pointing the `--model` argument of the `transformers.onnx` package to that directory:
```bash
python -m transformers.onnx --model=local-pt-checkpoint onnx/
```

View File

@@ -0,0 +1,45 @@
<!--Copyright 2020 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Exporting 🤗 Transformers models to ONNX[[exporting--transformers-models-to-onnx]]
🤗 Transformers provides a `transformers.onnx` package that lets you convert model checkpoints to an ONNX graph by leveraging configuration objects.
See [this guide](../serialization) for more details.
## ONNX configurations[[onnx-configurations]]
Three abstract classes are provided; which one to inherit from depends on the type of model architecture you want to export:
* Encoder-based models inherit from [`~onnx.config.OnnxConfig`]
* Decoder-based models inherit from [`~onnx.config.OnnxConfigWithPast`]
* Encoder-decoder models inherit from [`~onnx.config.OnnxSeq2SeqConfigWithPast`]
### OnnxConfig[[transformers.onnx.OnnxConfig]]
[[autodoc]] onnx.config.OnnxConfig
### OnnxConfigWithPast[[transformers.onnx.OnnxConfigWithPast]]
[[autodoc]] onnx.config.OnnxConfigWithPast
### OnnxSeq2SeqConfigWithPast[[OnnxSeq2SeqConfigWithPast]]
[[autodoc]] onnx.config.OnnxSeq2SeqConfigWithPast
## ONNX features[[onnx-features]]
Each ONNX configuration is associated with a set of _features_ that let you export models for different types of topologies or tasks.

View File

@@ -47,7 +47,7 @@ Models exported to ONNX format can be used as follows
To export a 🤗 Transformers model to ONNX, first install the extra dependencies:
```bash
pip install optimum-onnx
pip install optimum[exporters]
```
To see all available arguments, refer to the [🤗 Optimum documentation](https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/export_a_model#exporting-a-model-to-onnx-using-the-cli), or view the help on the command line.
@@ -123,3 +123,59 @@ Instead of the CLI, you can use `optimum.onnxruntime` to export a model programmatically
### Exporting a model for an unsupported architecture [[exporting-a-model-for-an-unsupported-architecture]]
To contribute support for a model that cannot currently be exported, first check whether it is supported in [`optimum.exporters.onnx`](https://huggingface.co/docs/optimum/exporters/onnx/overview), and if it is not, [contribute to 🤗 Optimum](https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/contribute).
### Exporting a model with `transformers.onnx` [[exporting-a-model-with-transformersonnx]]
<Tip warning={true}>
`transformers.onnx` is no longer maintained. Export models with 🤗 Optimum as described above. This section will be removed in a future version.
</Tip>
To export a 🤗 Transformers model to ONNX, install the extra dependencies:
```bash
pip install transformers[onnx]
```
Use the `transformers.onnx` package as a Python module to export a checkpoint with a ready-made configuration:
```bash
python -m transformers.onnx --model=distilbert/distilbert-base-uncased onnx/
```
This exports an ONNX graph of the checkpoint defined by the `--model` argument. You can pass a checkpoint from the 🤗 Hub or one stored locally. The resulting `model.onnx` file can be run on one of the many accelerators that support the ONNX standard. For example, load and run the model with ONNX Runtime as follows:
```python
>>> from transformers import AutoTokenizer
>>> from onnxruntime import InferenceSession
>>> tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
>>> session = InferenceSession("onnx/model.onnx")
>>> # ONNX Runtime expects NumPy arrays as input
>>> inputs = tokenizer("Using DistilBERT with ONNX Runtime!", return_tensors="np")
>>> outputs = session.run(output_names=["last_hidden_state"], input_feed=dict(inputs))
```
The required output names (e.g. `["last_hidden_state"]`) can be obtained by checking each model's ONNX configuration. For example, for DistilBERT:
```python
>>> from transformers.models.distilbert import DistilBertConfig, DistilBertOnnxConfig
>>> config = DistilBertConfig()
>>> onnx_config = DistilBertOnnxConfig(config)
>>> print(list(onnx_config.outputs.keys()))
["last_hidden_state"]
```
The same process applies to TensorFlow checkpoints on the Hub. For example, export a pure TensorFlow checkpoint as follows:
```bash
python -m transformers.onnx --model=keras-io/transformers-qa onnx/
```
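Note that task-specific checkpoints expose different graph outputs than the base model. As a hedged sketch, they can be inspected with ONNX Runtime after the export above:
```python
from onnxruntime import InferenceSession

session = InferenceSession("onnx/model.onnx")
# a question-answering head typically yields start/end logits
# rather than last_hidden_state
print([output.name for output in session.get_outputs()])
```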
To export a locally stored model, save the model's weight and tokenizer files in the same directory, then point the `--model` argument of the `transformers.onnx` package to that directory to export to ONNX:
```bash
python -m transformers.onnx --model=local-pt-checkpoint onnx/
```

View File

@@ -0,0 +1,45 @@
<!--Copyright 2020 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Exporting 🤗 Transformers models to ONNX
🤗 Transformers provides a `transformers.onnx` package that lets you convert model checkpoints to an ONNX graph by leveraging configuration objects.
For more details, see the [guide](../serialization) on exporting 🤗 Transformers models.
## ONNX Configurations
Three abstract classes are provided, depending on the type of model architecture you want to export:
* Encoder-based models inherit from [`~onnx.config.OnnxConfig`]
* Decoder-based models inherit from [`~onnx.config.OnnxConfigWithPast`]
* Encoder-decoder models inherit from [`~onnx.config.OnnxSeq2SeqConfigWithPast`]
### OnnxConfig
[[autodoc]] onnx.config.OnnxConfig
### OnnxConfigWithPast
[[autodoc]] onnx.config.OnnxConfigWithPast
### OnnxSeq2SeqConfigWithPast
[[autodoc]] onnx.config.OnnxSeq2SeqConfigWithPast
## ONNX Features
Each ONNX configuration is associated with a set of _features_ that let you export models for different types of topologies or tasks.

View File

@@ -47,7 +47,7 @@ rendered properly in your Markdown viewer.
To export a 🤗 Transformers model to ONNX, first install the extra dependencies:
```bash
pip install optimum-onnx
pip install optimum[exporters]
```
Refer to the [🤗 Optimum documentation](https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/export_a_model#exporting-a-model-to-onnx-using-the-cli) for all available arguments, or view the help on the command line:
@@ -117,3 +117,53 @@ optimum-cli export onnx --model local_path --task question-answering distilbert_
### Exporting a model for an unsupported architecture
If you want to add support for a model that cannot currently be exported, first check whether it is supported in [`optimum.exporters.onnx`](https://huggingface.co/docs/optimum/exporters/onnx/overview); if not, you can [contribute to 🤗 Optimum](https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/contribute) directly.
### Exporting a model with `transformers.onnx`
<Tip warning={true}>
`transformers.onnx` is no longer maintained. Please export models with 🤗 Optimum as described above. This section will be removed in a future release.
</Tip>
To export a 🤗 Transformers model to ONNX with `transformers.onnx`, install the extra dependencies:
```bash
pip install transformers[onnx]
```
Use the `transformers.onnx` package as a Python module to export a checkpoint with a ready-made configuration:
```bash
python -m transformers.onnx --model=distilbert/distilbert-base-uncased onnx/
```
This exports an ONNX graph of the checkpoint defined by the `--model` argument. Pass any checkpoint on the 🤗 Hub or one stored locally. The resulting `model.onnx` file can run on one of the many accelerators that support the ONNX standard. For example, load and run the model with ONNX Runtime as follows:
```python
>>> from transformers import AutoTokenizer
>>> from onnxruntime import InferenceSession
>>> tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
>>> session = InferenceSession("onnx/model.onnx")
>>> # ONNX Runtime expects NumPy arrays as input
>>> inputs = tokenizer("Using DistilBERT with ONNX Runtime!", return_tensors="np")
>>> outputs = session.run(output_names=["last_hidden_state"], input_feed=dict(inputs))
```
The required output names (e.g. `["last_hidden_state"]`) can be obtained by checking each model's ONNX configuration. For DistilBERT, for example:
```python
>>> from transformers.models.distilbert import DistilBertConfig, DistilBertOnnxConfig
>>> config = DistilBertConfig()
>>> onnx_config = DistilBertOnnxConfig(config)
>>> print(list(onnx_config.outputs.keys()))
["last_hidden_state"]
```
To export a locally stored model, save the model's weights and tokenizer files in the same directory (e.g. `local-pt-checkpoint`), then export to ONNX by pointing the `--model` argument of the `transformers.onnx` package to that directory:
```bash
python -m transformers.onnx --model=local-pt-checkpoint onnx/
```

View File

@@ -129,6 +129,8 @@ _import_structure = {
],
"loss": [],
"modelcard": ["ModelCard"],
# Models
"onnx": [],
"pipelines": [
"AudioClassificationPipeline",
"AutomaticSpeechRecognitionPipeline",

View File

@@ -442,6 +442,75 @@ def normalize(
return image
def unnormalize(
image: Union[np.ndarray, "torch.Tensor"],
mean: Union[float, Collection[float]],
std: Union[float, Collection[float]],
data_format: Optional[ChannelDimension] = None,
input_data_format: Optional[Union[str, ChannelDimension]] = None,
) -> np.ndarray:
"""
Inverse of `normalize`:
image = image * std + mean
Args:
image (`np.ndarray` or `torch.Tensor`):
The image to unnormalize.
mean (`float` or `Collection[float]`):
The mean to use for unnormalization.
std (`float` or `Collection[float]`):
The standard deviation to use for unnormalization.
data_format (`ChannelDimension`, *optional*):
The channel dimension format of the output image. If unset, will use the inferred format from the input.
input_data_format (`ChannelDimension`, *optional*):
The channel dimension format of the input image. If unset, will use the inferred format from the input.
Returns:
`np.ndarray`: The unnormalized image.
"""
is_torch_input = isinstance(image, torch.Tensor)
if is_torch_input:
image = image.detach().cpu().numpy()
elif not isinstance(image, np.ndarray):
raise TypeError("image must be a numpy array or a torch tensor")
if input_data_format is None:
input_data_format = infer_channel_dimension_format(image)
if not np.issubdtype(image.dtype, np.floating):
image = image.astype(np.float32)
channel_axis = get_channel_dimension_axis(image, input_data_format=input_data_format)
num_channels = image.shape[channel_axis]
if isinstance(mean, Collection):
if len(mean) != num_channels:
raise ValueError(f"mean must have {num_channels} elements if it is an iterable, got {len(mean)}")
else:
mean = [mean] * num_channels
mean = np.array(mean, dtype=image.dtype)
if isinstance(std, Collection):
if len(std) != num_channels:
raise ValueError(f"std must have {num_channels} elements if it is an iterable, got {len(std)}")
else:
std = [std] * num_channels
std = np.array(std, dtype=image.dtype)
if input_data_format == ChannelDimension.LAST:
image = image * std + mean
else:
shape = [1] * image.ndim
shape[channel_axis] = num_channels
mean = mean.reshape(shape)
std = std.reshape(shape)
image = image * std + mean
image = to_channel_dimension_format(image, data_format, input_data_format) if data_format is not None else image
return image
def center_crop(
image: np.ndarray,
size: tuple[int, int],

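As a round-trip sketch of the `unnormalize` helper added above, assuming it is importable from `transformers.image_transforms` next to `normalize`:
```python
import numpy as np

from transformers.image_transforms import normalize, unnormalize

# channels-last float image with per-channel statistics
image = np.random.rand(224, 224, 3).astype(np.float32)
mean, std = [0.485, 0.456, 0.406], [0.229, 0.224, 0.225]

normalized = normalize(image, mean=mean, std=std)
restored = unnormalize(normalized, mean=mean, std=std)

# unnormalize inverts normalize up to floating-point error
assert np.allclose(restored, image, atol=1e-5)
```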
View File

@@ -15,7 +15,11 @@
# limitations under the License.
"""ALBERT model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
class AlbertConfig(PreTrainedConfig):
@@ -138,4 +142,21 @@ class AlbertConfig(PreTrainedConfig):
self.classifier_dropout_prob = classifier_dropout_prob
__all__ = ["AlbertConfig"]
# Copied from transformers.models.bert.configuration_bert.BertOnnxConfig with Roberta->Albert
class AlbertOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
return OrderedDict(
[
("input_ids", dynamic_axis),
("attention_mask", dynamic_axis),
("token_type_ids", dynamic_axis),
]
)
__all__ = ["AlbertConfig", "AlbertOnnxConfig"]
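A minimal usage sketch, mirroring the DistilBERT example in the docs above and assuming the class is exposed as in this diff:
```python
from transformers import AlbertConfig
from transformers.models.albert.configuration_albert import AlbertOnnxConfig

onnx_config = AlbertOnnxConfig(AlbertConfig())
print(list(onnx_config.inputs.keys()))
# ['input_ids', 'attention_mask', 'token_type_ids']
```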

View File

@@ -15,9 +15,15 @@
"""BART model configuration"""
import warnings
from collections import OrderedDict
from collections.abc import Mapping
from typing import Any
from ... import PreTrainedTokenizer
from ...configuration_utils import PreTrainedConfig
from ...utils import logging
from ...onnx import OnnxConfig, OnnxConfigWithPast, OnnxSeq2SeqConfigWithPast
from ...onnx.utils import compute_effective_axis_dimension
from ...utils import is_torch_available, logging
logger = logging.get_logger(__name__)
@@ -174,4 +180,223 @@ class BartConfig(PreTrainedConfig):
)
__all__ = ["BartConfig"]
class BartOnnxConfig(OnnxSeq2SeqConfigWithPast):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task in ["default", "seq2seq-lm"]:
common_inputs = OrderedDict(
[
("input_ids", {0: "batch", 1: "encoder_sequence"}),
("attention_mask", {0: "batch", 1: "encoder_sequence"}),
]
)
if self.use_past:
common_inputs["decoder_input_ids"] = {0: "batch"}
common_inputs["decoder_attention_mask"] = {0: "batch", 1: "past_decoder_sequence + sequence"}
else:
common_inputs["decoder_input_ids"] = {0: "batch", 1: "decoder_sequence"}
common_inputs["decoder_attention_mask"] = {0: "batch", 1: "decoder_sequence"}
if self.use_past:
self.fill_with_past_key_values_(common_inputs, direction="inputs")
elif self.task == "causal-lm":
# TODO: figure this case out.
common_inputs = OrderedDict(
[
("input_ids", {0: "batch", 1: "encoder_sequence"}),
("attention_mask", {0: "batch", 1: "encoder_sequence"}),
]
)
if self.use_past:
num_encoder_layers, _ = self.num_layers
for i in range(num_encoder_layers):
common_inputs[f"past_key_values.{i}.key"] = {0: "batch", 2: "past_sequence + sequence"}
common_inputs[f"past_key_values.{i}.value"] = {0: "batch", 2: "past_sequence + sequence"}
else:
common_inputs = OrderedDict(
[
("input_ids", {0: "batch", 1: "encoder_sequence"}),
("attention_mask", {0: "batch", 1: "encoder_sequence"}),
("decoder_input_ids", {0: "batch", 1: "decoder_sequence"}),
("decoder_attention_mask", {0: "batch", 1: "decoder_sequence"}),
]
)
return common_inputs
@property
def outputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task in ["default", "seq2seq-lm"]:
common_outputs = super().outputs
else:
common_outputs = super(OnnxConfigWithPast, self).outputs
if self.use_past:
num_encoder_layers, _ = self.num_layers
for i in range(num_encoder_layers):
common_outputs[f"present.{i}.key"] = {0: "batch", 2: "past_sequence + sequence"}
common_outputs[f"present.{i}.value"] = {0: "batch", 2: "past_sequence + sequence"}
return common_outputs
def _generate_dummy_inputs_for_default_and_seq2seq_lm(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
encoder_inputs = self._generate_dummy_inputs_for_sequence_classification_and_question_answering(
tokenizer, batch_size, seq_length, is_pair
)
# Generate decoder inputs
decoder_seq_length = seq_length if not self.use_past else 1
decoder_inputs = self._generate_dummy_inputs_for_sequence_classification_and_question_answering(
tokenizer, batch_size, decoder_seq_length, is_pair
)
decoder_inputs = {f"decoder_{name}": tensor for name, tensor in decoder_inputs.items()}
common_inputs = dict(**encoder_inputs, **decoder_inputs)
if self.use_past:
if not is_torch_available():
raise ValueError("Cannot generate dummy past_keys inputs without PyTorch installed.")
else:
import torch
batch, encoder_seq_length = common_inputs["input_ids"].shape
decoder_seq_length = common_inputs["decoder_input_ids"].shape[1]
num_encoder_attention_heads, num_decoder_attention_heads = self.num_attention_heads
encoder_shape = (
batch,
num_encoder_attention_heads,
encoder_seq_length,
self._config.hidden_size // num_encoder_attention_heads,
)
decoder_past_length = decoder_seq_length + 3
decoder_shape = (
batch,
num_decoder_attention_heads,
decoder_past_length,
self._config.hidden_size // num_decoder_attention_heads,
)
common_inputs["decoder_attention_mask"] = torch.cat(
[common_inputs["decoder_attention_mask"], torch.ones(batch, decoder_past_length)], dim=1
)
common_inputs["past_key_values"] = []
# If the number of encoder and decoder layers are present in the model configuration, both are considered
num_encoder_layers, num_decoder_layers = self.num_layers
min_num_layers = min(num_encoder_layers, num_decoder_layers)
max_num_layers = max(num_encoder_layers, num_decoder_layers) - min_num_layers
remaining_side_name = "encoder" if num_encoder_layers > num_decoder_layers else "decoder"
for _ in range(min_num_layers):
common_inputs["past_key_values"].append(
(
torch.zeros(decoder_shape),
torch.zeros(decoder_shape),
torch.zeros(encoder_shape),
torch.zeros(encoder_shape),
)
)
# TODO: test this.
shape = encoder_shape if remaining_side_name == "encoder" else decoder_shape
for _ in range(min_num_layers, max_num_layers):
common_inputs["past_key_values"].append((torch.zeros(shape), torch.zeros(shape)))
return common_inputs
def _generate_dummy_inputs_for_causal_lm(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
common_inputs = self._generate_dummy_inputs_for_sequence_classification_and_question_answering(
tokenizer, batch_size, seq_length, is_pair
)
if self.use_past:
if not is_torch_available():
raise ValueError("Cannot generate dummy past_keys inputs without PyTorch installed.")
else:
import torch
batch, seqlen = common_inputs["input_ids"].shape
# Not using the same length for past_key_values
past_key_values_length = seqlen + 2
num_encoder_layers, _ = self.num_layers
num_encoder_attention_heads, _ = self.num_attention_heads
past_shape = (
batch,
num_encoder_attention_heads,
past_key_values_length,
self._config.hidden_size // num_encoder_attention_heads,
)
mask_dtype = common_inputs["attention_mask"].dtype
common_inputs["attention_mask"] = torch.cat(
[common_inputs["attention_mask"], torch.ones(batch, past_key_values_length, dtype=mask_dtype)], dim=1
)
common_inputs["past_key_values"] = [
(torch.zeros(past_shape), torch.zeros(past_shape)) for _ in range(num_encoder_layers)
]
return common_inputs
def _generate_dummy_inputs_for_sequence_classification_and_question_answering(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
# Copied from OnnxConfig.generate_dummy_inputs
# Did not use super(OnnxConfigWithPast, self).generate_dummy_inputs for code clarity.
# If dynamic axis (-1) we forward with a fixed dimension of 2 samples to avoid optimizations made by ONNX
batch_size = compute_effective_axis_dimension(
batch_size, fixed_dimension=OnnxConfig.default_fixed_batch, num_token_to_add=0
)
# If dynamic axis (-1) we forward with a fixed dimension of 8 tokens to avoid optimizations made by ONNX
token_to_add = tokenizer.num_special_tokens_to_add(is_pair)
seq_length = compute_effective_axis_dimension(
seq_length, fixed_dimension=OnnxConfig.default_fixed_sequence, num_token_to_add=token_to_add
)
# Generate dummy inputs according to compute batch and sequence
dummy_input = [" ".join([tokenizer.unk_token]) * seq_length] * batch_size
common_inputs = dict(tokenizer(dummy_input, return_tensors="pt"))
return common_inputs
def generate_dummy_inputs(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
if self.task in ["default", "seq2seq-lm"]:
common_inputs = self._generate_dummy_inputs_for_default_and_seq2seq_lm(
tokenizer, batch_size=batch_size, seq_length=seq_length, is_pair=is_pair
)
elif self.task == "causal-lm":
common_inputs = self._generate_dummy_inputs_for_causal_lm(
tokenizer, batch_size=batch_size, seq_length=seq_length, is_pair=is_pair
)
else:
common_inputs = self._generate_dummy_inputs_for_sequence_classification_and_question_answering(
tokenizer, batch_size=batch_size, seq_length=seq_length, is_pair=is_pair
)
return common_inputs
def _flatten_past_key_values_(self, flattened_output, name, idx, t):
if self.task in ["default", "seq2seq-lm"]:
flattened_output = super()._flatten_past_key_values_(flattened_output, name, idx, t)
else:
flattened_output = super(OnnxSeq2SeqConfigWithPast, self)._flatten_past_key_values_(
flattened_output, name, idx, t
)
__all__ = ["BartConfig", "BartOnnxConfig"]
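A hedged sketch of the dummy-input machinery above; it assumes PyTorch is installed, network access for the tokenizer, and that the class is exposed as in this diff:
```python
from transformers import AutoTokenizer, BartConfig
from transformers.models.bart.configuration_bart import BartOnnxConfig

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
onnx_config = BartOnnxConfig(BartConfig(), task="default")

print(list(onnx_config.inputs.keys()))
# ['input_ids', 'attention_mask', 'decoder_input_ids', 'decoder_attention_mask']

# batch_size/seq_length of -1 fall back to the fixed dummy dimensions (2 and 8)
dummy_inputs = onnx_config.generate_dummy_inputs(tokenizer)
print({name: tuple(tensor.shape) for name, tensor in dummy_inputs.items()})
```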

View File

@@ -15,8 +15,13 @@
"""BEiT model configuration"""
import warnings
from collections import OrderedDict
from collections.abc import Mapping
from packaging import version
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils.backbone_utils import BackboneConfigMixin, get_aligned_output_features_output_indices
@@ -204,4 +209,21 @@ class BeitConfig(BackboneConfigMixin, PreTrainedConfig):
self.reshape_hidden_states = reshape_hidden_states
__all__ = ["BeitConfig"]
# Copied from transformers.models.vit.configuration_vit.ViTOnnxConfig
class BeitOnnxConfig(OnnxConfig):
torch_onnx_minimum_version = version.parse("1.11")
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
]
)
@property
def atol_for_validation(self) -> float:
return 1e-4
__all__ = ["BeitConfig", "BeitOnnxConfig"]
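For illustration (same assumptions as the other sketches in this diff), the vision configuration exposes a single `pixel_values` input with four dynamic axes and a looser validation tolerance:
```python
from transformers import BeitConfig
from transformers.models.beit.configuration_beit import BeitOnnxConfig

onnx_config = BeitOnnxConfig(BeitConfig())
print(dict(onnx_config.inputs))         # {'pixel_values': {0: 'batch', ...}}
print(onnx_config.atol_for_validation)  # 0.0001
```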

View File

@@ -15,7 +15,11 @@
# limitations under the License.
"""BERT model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@@ -123,4 +127,20 @@ class BertConfig(PreTrainedConfig):
self.classifier_dropout = classifier_dropout
__all__ = ["BertConfig"]
class BertOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
return OrderedDict(
[
("input_ids", dynamic_axis),
("attention_mask", dynamic_axis),
("token_type_ids", dynamic_axis),
]
)
__all__ = ["BertConfig", "BertOnnxConfig"]

View File

@@ -14,7 +14,11 @@
# limitations under the License.
"""BigBird model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@@ -154,4 +158,19 @@ class BigBirdConfig(PreTrainedConfig):
self.classifier_dropout = classifier_dropout
__all__ = ["BigBirdConfig"]
class BigBirdOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
return OrderedDict(
[
("input_ids", dynamic_axis),
("attention_mask", dynamic_axis),
]
)
__all__ = ["BigBirdConfig", "BigBirdOnnxConfig"]

View File

@@ -14,8 +14,15 @@
# limitations under the License.
"""BigBirdPegasus model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from typing import Any
from ... import PreTrainedTokenizer
from ...configuration_utils import PreTrainedConfig
from ...utils import logging
from ...onnx import OnnxConfig, OnnxConfigWithPast, OnnxSeq2SeqConfigWithPast
from ...onnx.utils import compute_effective_axis_dimension
from ...utils import is_torch_available, logging
logger = logging.get_logger(__name__)
@@ -179,4 +186,224 @@ class BigBirdPegasusConfig(PreTrainedConfig):
)
__all__ = ["BigBirdPegasusConfig"]
# Copied from transformers.models.bart.configuration_bart.BartOnnxConfig with Bart->BigBirdPegasus
class BigBirdPegasusOnnxConfig(OnnxSeq2SeqConfigWithPast):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task in ["default", "seq2seq-lm"]:
common_inputs = OrderedDict(
[
("input_ids", {0: "batch", 1: "encoder_sequence"}),
("attention_mask", {0: "batch", 1: "encoder_sequence"}),
]
)
if self.use_past:
common_inputs["decoder_input_ids"] = {0: "batch"}
common_inputs["decoder_attention_mask"] = {0: "batch", 1: "past_decoder_sequence + sequence"}
else:
common_inputs["decoder_input_ids"] = {0: "batch", 1: "decoder_sequence"}
common_inputs["decoder_attention_mask"] = {0: "batch", 1: "decoder_sequence"}
if self.use_past:
self.fill_with_past_key_values_(common_inputs, direction="inputs")
elif self.task == "causal-lm":
# TODO: figure this case out.
common_inputs = OrderedDict(
[
("input_ids", {0: "batch", 1: "encoder_sequence"}),
("attention_mask", {0: "batch", 1: "encoder_sequence"}),
]
)
if self.use_past:
num_encoder_layers, _ = self.num_layers
for i in range(num_encoder_layers):
common_inputs[f"past_key_values.{i}.key"] = {0: "batch", 2: "past_sequence + sequence"}
common_inputs[f"past_key_values.{i}.value"] = {0: "batch", 2: "past_sequence + sequence"}
else:
common_inputs = OrderedDict(
[
("input_ids", {0: "batch", 1: "encoder_sequence"}),
("attention_mask", {0: "batch", 1: "encoder_sequence"}),
("decoder_input_ids", {0: "batch", 1: "decoder_sequence"}),
("decoder_attention_mask", {0: "batch", 1: "decoder_sequence"}),
]
)
return common_inputs
@property
def outputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task in ["default", "seq2seq-lm"]:
common_outputs = super().outputs
else:
common_outputs = super(OnnxConfigWithPast, self).outputs
if self.use_past:
num_encoder_layers, _ = self.num_layers
for i in range(num_encoder_layers):
common_outputs[f"present.{i}.key"] = {0: "batch", 2: "past_sequence + sequence"}
common_outputs[f"present.{i}.value"] = {0: "batch", 2: "past_sequence + sequence"}
return common_outputs
def _generate_dummy_inputs_for_default_and_seq2seq_lm(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
encoder_inputs = self._generate_dummy_inputs_for_sequence_classification_and_question_answering(
tokenizer, batch_size, seq_length, is_pair
)
# Generate decoder inputs
decoder_seq_length = seq_length if not self.use_past else 1
decoder_inputs = self._generate_dummy_inputs_for_sequence_classification_and_question_answering(
tokenizer, batch_size, decoder_seq_length, is_pair
)
decoder_inputs = {f"decoder_{name}": tensor for name, tensor in decoder_inputs.items()}
common_inputs = dict(**encoder_inputs, **decoder_inputs)
if self.use_past:
if not is_torch_available():
raise ValueError("Cannot generate dummy past_keys inputs without PyTorch installed.")
else:
import torch
batch, encoder_seq_length = common_inputs["input_ids"].shape
decoder_seq_length = common_inputs["decoder_input_ids"].shape[1]
num_encoder_attention_heads, num_decoder_attention_heads = self.num_attention_heads
encoder_shape = (
batch,
num_encoder_attention_heads,
encoder_seq_length,
self._config.hidden_size // num_encoder_attention_heads,
)
decoder_past_length = decoder_seq_length + 3
decoder_shape = (
batch,
num_decoder_attention_heads,
decoder_past_length,
self._config.hidden_size // num_decoder_attention_heads,
)
common_inputs["decoder_attention_mask"] = torch.cat(
[common_inputs["decoder_attention_mask"], torch.ones(batch, decoder_past_length)], dim=1
)
common_inputs["past_key_values"] = []
# If the number of encoder and decoder layers are present in the model configuration, both are considered
num_encoder_layers, num_decoder_layers = self.num_layers
min_num_layers = min(num_encoder_layers, num_decoder_layers)
max_num_layers = max(num_encoder_layers, num_decoder_layers) - min_num_layers
remaining_side_name = "encoder" if num_encoder_layers > num_decoder_layers else "decoder"
for _ in range(min_num_layers):
common_inputs["past_key_values"].append(
(
torch.zeros(decoder_shape),
torch.zeros(decoder_shape),
torch.zeros(encoder_shape),
torch.zeros(encoder_shape),
)
)
# TODO: test this.
shape = encoder_shape if remaining_side_name == "encoder" else decoder_shape
for _ in range(min_num_layers, max_num_layers):
common_inputs["past_key_values"].append((torch.zeros(shape), torch.zeros(shape)))
return common_inputs
def _generate_dummy_inputs_for_causal_lm(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
common_inputs = self._generate_dummy_inputs_for_sequence_classification_and_question_answering(
tokenizer, batch_size, seq_length, is_pair
)
if self.use_past:
if not is_torch_available():
raise ValueError("Cannot generate dummy past_keys inputs without PyTorch installed.")
else:
import torch
batch, seqlen = common_inputs["input_ids"].shape
# Not using the same length for past_key_values
past_key_values_length = seqlen + 2
num_encoder_layers, _ = self.num_layers
num_encoder_attention_heads, _ = self.num_attention_heads
past_shape = (
batch,
num_encoder_attention_heads,
past_key_values_length,
self._config.hidden_size // num_encoder_attention_heads,
)
mask_dtype = common_inputs["attention_mask"].dtype
common_inputs["attention_mask"] = torch.cat(
[common_inputs["attention_mask"], torch.ones(batch, past_key_values_length, dtype=mask_dtype)], dim=1
)
common_inputs["past_key_values"] = [
(torch.zeros(past_shape), torch.zeros(past_shape)) for _ in range(num_encoder_layers)
]
return common_inputs
def _generate_dummy_inputs_for_sequence_classification_and_question_answering(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
# Copied from OnnxConfig.generate_dummy_inputs
# Did not use super(OnnxConfigWithPast, self).generate_dummy_inputs for code clarity.
# If dynamic axis (-1) we forward with a fixed dimension of 2 samples to avoid optimizations made by ONNX
batch_size = compute_effective_axis_dimension(
batch_size, fixed_dimension=OnnxConfig.default_fixed_batch, num_token_to_add=0
)
# If dynamic axis (-1) we forward with a fixed dimension of 8 tokens to avoid optimizations made by ONNX
token_to_add = tokenizer.num_special_tokens_to_add(is_pair)
seq_length = compute_effective_axis_dimension(
seq_length, fixed_dimension=OnnxConfig.default_fixed_sequence, num_token_to_add=token_to_add
)
# Generate dummy inputs according to compute batch and sequence
dummy_input = [" ".join([tokenizer.unk_token]) * seq_length] * batch_size
common_inputs = dict(tokenizer(dummy_input, return_tensors="pt"))
return common_inputs
def generate_dummy_inputs(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
if self.task in ["default", "seq2seq-lm"]:
common_inputs = self._generate_dummy_inputs_for_default_and_seq2seq_lm(
tokenizer, batch_size=batch_size, seq_length=seq_length, is_pair=is_pair
)
elif self.task == "causal-lm":
common_inputs = self._generate_dummy_inputs_for_causal_lm(
tokenizer, batch_size=batch_size, seq_length=seq_length, is_pair=is_pair
)
else:
common_inputs = self._generate_dummy_inputs_for_sequence_classification_and_question_answering(
tokenizer, batch_size=batch_size, seq_length=seq_length, is_pair=is_pair
)
return common_inputs
def _flatten_past_key_values_(self, flattened_output, name, idx, t):
if self.task in ["default", "seq2seq-lm"]:
flattened_output = super()._flatten_past_key_values_(flattened_output, name, idx, t)
else:
flattened_output = super(OnnxSeq2SeqConfigWithPast, self)._flatten_past_key_values_(
flattened_output, name, idx, t
)
__all__ = ["BigBirdPegasusConfig", "BigBirdPegasusOnnxConfig"]

View File

@@ -14,7 +14,15 @@
# limitations under the License.
"""Blenderbot model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from typing import Any
from ... import PreTrainedTokenizer
from ...configuration_utils import PreTrainedConfig
from ...file_utils import is_torch_available
from ...onnx import OnnxConfig, OnnxConfigWithPast, OnnxSeq2SeqConfigWithPast
from ...onnx.utils import compute_effective_axis_dimension
from ...utils import logging
@@ -158,4 +166,227 @@ class BlenderbotConfig(PreTrainedConfig):
)
__all__ = ["BlenderbotConfig"]
class BlenderbotOnnxConfig(OnnxSeq2SeqConfigWithPast):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task in ["default", "seq2seq-lm"]:
common_inputs = OrderedDict(
[
("input_ids", {0: "batch", 1: "encoder_sequence"}),
("attention_mask", {0: "batch", 1: "encoder_sequence"}),
]
)
if self.use_past:
common_inputs["decoder_input_ids"] = {0: "batch"}
common_inputs["decoder_attention_mask"] = {0: "batch", 1: "past_decoder_sequence + sequence"}
else:
common_inputs["decoder_input_ids"] = {0: "batch", 1: "decoder_sequence"}
common_inputs["decoder_attention_mask"] = {0: "batch", 1: "decoder_sequence"}
if self.use_past:
self.fill_with_past_key_values_(common_inputs, direction="inputs")
elif self.task == "causal-lm":
common_inputs = OrderedDict(
[
("input_ids", {0: "batch", 1: "encoder_sequence"}),
("attention_mask", {0: "batch", 1: "encoder_sequence"}),
]
)
if self.use_past:
_, num_decoder_layers = self.num_layers
for i in range(num_decoder_layers):
common_inputs[f"past_key_values.{i}.key"] = {0: "batch", 2: "past_sequence + sequence"}
common_inputs[f"past_key_values.{i}.value"] = {0: "batch", 2: "past_sequence + sequence"}
else:
common_inputs = OrderedDict(
[
("input_ids", {0: "batch", 1: "encoder_sequence"}),
("attention_mask", {0: "batch", 1: "encoder_sequence"}),
("decoder_input_ids", {0: "batch", 1: "decoder_sequence"}),
("decoder_attention_mask", {0: "batch", 1: "decoder_sequence"}),
]
)
return common_inputs
@property
# Copied from transformers.models.bart.configuration_bart.BartOnnxConfig.outputs
def outputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task in ["default", "seq2seq-lm"]:
common_outputs = super().outputs
else:
common_outputs = super(OnnxConfigWithPast, self).outputs
if self.use_past:
num_encoder_layers, _ = self.num_layers
for i in range(num_encoder_layers):
common_outputs[f"present.{i}.key"] = {0: "batch", 2: "past_sequence + sequence"}
common_outputs[f"present.{i}.value"] = {0: "batch", 2: "past_sequence + sequence"}
return common_outputs
def _generate_dummy_inputs_for_default_and_seq2seq_lm(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
encoder_inputs = self._generate_dummy_inputs_for_sequence_classification_and_question_answering(
tokenizer, batch_size, seq_length, is_pair
)
# Generate decoder inputs
decoder_seq_length = seq_length if not self.use_past else 1
decoder_inputs = self._generate_dummy_inputs_for_sequence_classification_and_question_answering(
tokenizer, batch_size, decoder_seq_length, is_pair
)
decoder_inputs = {f"decoder_{name}": tensor for name, tensor in decoder_inputs.items()}
common_inputs = dict(**encoder_inputs, **decoder_inputs)
if self.use_past:
if not is_torch_available():
raise ValueError("Cannot generate dummy past_keys inputs without PyTorch installed.")
else:
import torch
batch, encoder_seq_length = common_inputs["input_ids"].shape
decoder_seq_length = common_inputs["decoder_input_ids"].shape[1]
num_encoder_attention_heads, num_decoder_attention_heads = self.num_attention_heads
encoder_shape = (
batch,
num_encoder_attention_heads,
encoder_seq_length,
self._config.hidden_size // num_encoder_attention_heads,
)
decoder_past_length = decoder_seq_length
decoder_shape = (
batch,
num_decoder_attention_heads,
decoder_past_length,
self._config.hidden_size // num_decoder_attention_heads,
)
common_inputs["decoder_attention_mask"] = torch.cat(
[common_inputs["decoder_attention_mask"], torch.ones(batch, decoder_past_length)], dim=1
)
common_inputs["past_key_values"] = []
_, num_decoder_layers = self.num_layers
for _ in range(num_decoder_layers):
common_inputs["past_key_values"].append(
(
torch.zeros(decoder_shape),
torch.zeros(decoder_shape),
torch.zeros(encoder_shape),
torch.zeros(encoder_shape),
)
)
return common_inputs
def _generate_dummy_inputs_for_causal_lm(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
common_inputs = self._generate_dummy_inputs_for_sequence_classification_and_question_answering(
tokenizer, batch_size, seq_length, is_pair
)
if self.use_past:
if not is_torch_available():
raise ValueError("Cannot generate dummy past_keys inputs without PyTorch installed.")
else:
import torch
batch, seqlen = common_inputs["input_ids"].shape
past_key_values_length = seqlen
_, num_decoder_layers = self.num_layers
num_encoder_attention_heads, _ = self.num_attention_heads
past_shape = (
batch,
num_encoder_attention_heads,
past_key_values_length,
self._config.hidden_size // num_encoder_attention_heads,
)
mask_dtype = common_inputs["attention_mask"].dtype
common_inputs["attention_mask"] = torch.cat(
[common_inputs["attention_mask"], torch.ones(batch, past_key_values_length, dtype=mask_dtype)], dim=1
)
common_inputs["past_key_values"] = [
(torch.zeros(past_shape), torch.zeros(past_shape)) for _ in range(num_decoder_layers)
]
return common_inputs
# Copied from transformers.models.bart.configuration_bart.BartOnnxConfig._generate_dummy_inputs_for_sequence_classification_and_question_answering
def _generate_dummy_inputs_for_sequence_classification_and_question_answering(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
# Copied from OnnxConfig.generate_dummy_inputs
# Did not use super(OnnxConfigWithPast, self).generate_dummy_inputs for code clarity.
# If dynamic axis (-1) we forward with a fixed dimension of 2 samples to avoid optimizations made by ONNX
batch_size = compute_effective_axis_dimension(
batch_size, fixed_dimension=OnnxConfig.default_fixed_batch, num_token_to_add=0
)
# If dynamic axis (-1) we forward with a fixed dimension of 8 tokens to avoid optimizations made by ONNX
token_to_add = tokenizer.num_special_tokens_to_add(is_pair)
seq_length = compute_effective_axis_dimension(
seq_length, fixed_dimension=OnnxConfig.default_fixed_sequence, num_token_to_add=token_to_add
)
# Generate dummy inputs according to compute batch and sequence
dummy_input = [" ".join([tokenizer.unk_token]) * seq_length] * batch_size
common_inputs = dict(tokenizer(dummy_input, return_tensors="pt"))
return common_inputs
# Copied from transformers.models.bart.configuration_bart.BartOnnxConfig.generate_dummy_inputs
def generate_dummy_inputs(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
if self.task in ["default", "seq2seq-lm"]:
common_inputs = self._generate_dummy_inputs_for_default_and_seq2seq_lm(
tokenizer, batch_size=batch_size, seq_length=seq_length, is_pair=is_pair
)
elif self.task == "causal-lm":
common_inputs = self._generate_dummy_inputs_for_causal_lm(
tokenizer, batch_size=batch_size, seq_length=seq_length, is_pair=is_pair
)
else:
common_inputs = self._generate_dummy_inputs_for_sequence_classification_and_question_answering(
tokenizer, batch_size=batch_size, seq_length=seq_length, is_pair=is_pair
)
return common_inputs
# Copied from transformers.models.bart.configuration_bart.BartOnnxConfig._flatten_past_key_values_
def _flatten_past_key_values_(self, flattened_output, name, idx, t):
if self.task in ["default", "seq2seq-lm"]:
flattened_output = super()._flatten_past_key_values_(flattened_output, name, idx, t)
else:
flattened_output = super(OnnxSeq2SeqConfigWithPast, self)._flatten_past_key_values_(
flattened_output, name, idx, t
)
def fill_with_past_key_values_(self, inputs_or_outputs: Mapping[str, Mapping[int, str]], direction: str):
if direction not in ["inputs", "outputs"]:
raise ValueError(f'direction must either be "inputs" or "outputs", but {direction} was given')
name = "past_key_values" if direction == "inputs" else "present"
_, num_decoder_layers = self.num_layers
encoder_sequence = "past_encoder_sequence"
decoder_sequence = "past_decoder_sequence" if direction == "inputs" else "past_decoder_sequence + sequence"
for i in range(num_decoder_layers):
inputs_or_outputs[f"{name}.{i}.decoder.key"] = {0: "batch", 2: decoder_sequence}
inputs_or_outputs[f"{name}.{i}.decoder.value"] = {0: "batch", 2: decoder_sequence}
inputs_or_outputs[f"{name}.{i}.encoder.key"] = {0: "batch", 2: encoder_sequence}
inputs_or_outputs[f"{name}.{i}.encoder.value"] = {0: "batch", 2: encoder_sequence}
__all__ = ["BlenderbotConfig", "BlenderbotOnnxConfig"]
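Because `fill_with_past_key_values_` is overridden above to iterate over decoder layers only, the past-key-value input names differ from the BART-style configs. A sketch, under the same assumptions as above:
```python
from transformers import BlenderbotConfig
from transformers.models.blenderbot.configuration_blenderbot import BlenderbotOnnxConfig

onnx_config = BlenderbotOnnxConfig(BlenderbotConfig(), task="default", use_past=True)
past_inputs = [name for name in onnx_config.inputs if name.startswith("past_key_values")]
print(past_inputs[:4])
# ['past_key_values.0.decoder.key', 'past_key_values.0.decoder.value',
#  'past_key_values.0.encoder.key', 'past_key_values.0.encoder.value']
```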

View File

@@ -14,7 +14,15 @@
# limitations under the License.
"""BlenderbotSmall model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from typing import Any
from ... import PreTrainedTokenizer
from ...configuration_utils import PreTrainedConfig
from ...file_utils import is_torch_available
from ...onnx import OnnxConfig, OnnxConfigWithPast, OnnxSeq2SeqConfigWithPast
from ...onnx.utils import compute_effective_axis_dimension
from ...utils import logging
@@ -156,4 +164,224 @@ class BlenderbotSmallConfig(PreTrainedConfig):
)
__all__ = ["BlenderbotSmallConfig"]
# Copied from transformers.models.bart.configuration_bart.BartOnnxConfig with Bart->BlenderbotSmall
class BlenderbotSmallOnnxConfig(OnnxSeq2SeqConfigWithPast):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task in ["default", "seq2seq-lm"]:
common_inputs = OrderedDict(
[
("input_ids", {0: "batch", 1: "encoder_sequence"}),
("attention_mask", {0: "batch", 1: "encoder_sequence"}),
]
)
if self.use_past:
common_inputs["decoder_input_ids"] = {0: "batch"}
common_inputs["decoder_attention_mask"] = {0: "batch", 1: "past_decoder_sequence + sequence"}
else:
common_inputs["decoder_input_ids"] = {0: "batch", 1: "decoder_sequence"}
common_inputs["decoder_attention_mask"] = {0: "batch", 1: "decoder_sequence"}
if self.use_past:
self.fill_with_past_key_values_(common_inputs, direction="inputs")
elif self.task == "causal-lm":
# TODO: figure this case out.
common_inputs = OrderedDict(
[
("input_ids", {0: "batch", 1: "encoder_sequence"}),
("attention_mask", {0: "batch", 1: "encoder_sequence"}),
]
)
if self.use_past:
num_encoder_layers, _ = self.num_layers
for i in range(num_encoder_layers):
common_inputs[f"past_key_values.{i}.key"] = {0: "batch", 2: "past_sequence + sequence"}
common_inputs[f"past_key_values.{i}.value"] = {0: "batch", 2: "past_sequence + sequence"}
else:
common_inputs = OrderedDict(
[
("input_ids", {0: "batch", 1: "encoder_sequence"}),
("attention_mask", {0: "batch", 1: "encoder_sequence"}),
("decoder_input_ids", {0: "batch", 1: "decoder_sequence"}),
("decoder_attention_mask", {0: "batch", 1: "decoder_sequence"}),
]
)
return common_inputs
@property
def outputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task in ["default", "seq2seq-lm"]:
common_outputs = super().outputs
else:
common_outputs = super(OnnxConfigWithPast, self).outputs
if self.use_past:
num_encoder_layers, _ = self.num_layers
for i in range(num_encoder_layers):
common_outputs[f"present.{i}.key"] = {0: "batch", 2: "past_sequence + sequence"}
common_outputs[f"present.{i}.value"] = {0: "batch", 2: "past_sequence + sequence"}
return common_outputs
def _generate_dummy_inputs_for_default_and_seq2seq_lm(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
encoder_inputs = self._generate_dummy_inputs_for_sequence_classification_and_question_answering(
tokenizer, batch_size, seq_length, is_pair
)
# Generate decoder inputs
decoder_seq_length = seq_length if not self.use_past else 1
decoder_inputs = self._generate_dummy_inputs_for_sequence_classification_and_question_answering(
tokenizer, batch_size, decoder_seq_length, is_pair
)
decoder_inputs = {f"decoder_{name}": tensor for name, tensor in decoder_inputs.items()}
common_inputs = dict(**encoder_inputs, **decoder_inputs)
if self.use_past:
if not is_torch_available():
raise ValueError("Cannot generate dummy past_keys inputs without PyTorch installed.")
else:
import torch
batch, encoder_seq_length = common_inputs["input_ids"].shape
decoder_seq_length = common_inputs["decoder_input_ids"].shape[1]
num_encoder_attention_heads, num_decoder_attention_heads = self.num_attention_heads
encoder_shape = (
batch,
num_encoder_attention_heads,
encoder_seq_length,
self._config.hidden_size // num_encoder_attention_heads,
)
decoder_past_length = decoder_seq_length + 3
decoder_shape = (
batch,
num_decoder_attention_heads,
decoder_past_length,
self._config.hidden_size // num_decoder_attention_heads,
)
common_inputs["decoder_attention_mask"] = torch.cat(
[common_inputs["decoder_attention_mask"], torch.ones(batch, decoder_past_length)], dim=1
)
common_inputs["past_key_values"] = []
# If the number of encoder and decoder layers are present in the model configuration, both are considered
num_encoder_layers, num_decoder_layers = self.num_layers
min_num_layers = min(num_encoder_layers, num_decoder_layers)
max_num_layers = max(num_encoder_layers, num_decoder_layers) - min_num_layers
remaining_side_name = "encoder" if num_encoder_layers > num_decoder_layers else "decoder"
for _ in range(min_num_layers):
common_inputs["past_key_values"].append(
(
torch.zeros(decoder_shape),
torch.zeros(decoder_shape),
torch.zeros(encoder_shape),
torch.zeros(encoder_shape),
)
)
# TODO: test this.
shape = encoder_shape if remaining_side_name == "encoder" else decoder_shape
for _ in range(min_num_layers, max_num_layers):
common_inputs["past_key_values"].append((torch.zeros(shape), torch.zeros(shape)))
return common_inputs
def _generate_dummy_inputs_for_causal_lm(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
common_inputs = self._generate_dummy_inputs_for_sequence_classification_and_question_answering(
tokenizer, batch_size, seq_length, is_pair
)
if self.use_past:
if not is_torch_available():
raise ValueError("Cannot generate dummy past_keys inputs without PyTorch installed.")
else:
import torch
batch, seqlen = common_inputs["input_ids"].shape
# Not using the same length for past_key_values
past_key_values_length = seqlen + 2
num_encoder_layers, _ = self.num_layers
num_encoder_attention_heads, _ = self.num_attention_heads
past_shape = (
batch,
num_encoder_attention_heads,
past_key_values_length,
self._config.hidden_size // num_encoder_attention_heads,
)
mask_dtype = common_inputs["attention_mask"].dtype
common_inputs["attention_mask"] = torch.cat(
[common_inputs["attention_mask"], torch.ones(batch, past_key_values_length, dtype=mask_dtype)], dim=1
)
common_inputs["past_key_values"] = [
(torch.zeros(past_shape), torch.zeros(past_shape)) for _ in range(num_encoder_layers)
]
return common_inputs
def _generate_dummy_inputs_for_sequence_classification_and_question_answering(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
# Copied from OnnxConfig.generate_dummy_inputs
# Did not use super(OnnxConfigWithPast, self).generate_dummy_inputs for code clarity.
# If dynamic axis (-1) we forward with a fixed dimension of 2 samples to avoid optimizations made by ONNX
batch_size = compute_effective_axis_dimension(
batch_size, fixed_dimension=OnnxConfig.default_fixed_batch, num_token_to_add=0
)
# If dynamic axis (-1) we forward with a fixed dimension of 8 tokens to avoid optimizations made by ONNX
token_to_add = tokenizer.num_special_tokens_to_add(is_pair)
seq_length = compute_effective_axis_dimension(
seq_length, fixed_dimension=OnnxConfig.default_fixed_sequence, num_token_to_add=token_to_add
)
# Generate dummy inputs according to compute batch and sequence
dummy_input = [" ".join([tokenizer.unk_token]) * seq_length] * batch_size
common_inputs = dict(tokenizer(dummy_input, return_tensors="pt"))
return common_inputs
def generate_dummy_inputs(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
if self.task in ["default", "seq2seq-lm"]:
common_inputs = self._generate_dummy_inputs_for_default_and_seq2seq_lm(
tokenizer, batch_size=batch_size, seq_length=seq_length, is_pair=is_pair
)
elif self.task == "causal-lm":
common_inputs = self._generate_dummy_inputs_for_causal_lm(
tokenizer, batch_size=batch_size, seq_length=seq_length, is_pair=is_pair
)
else:
common_inputs = self._generate_dummy_inputs_for_sequence_classification_and_question_answering(
tokenizer, batch_size=batch_size, seq_length=seq_length, is_pair=is_pair
)
return common_inputs
def _flatten_past_key_values_(self, flattened_output, name, idx, t):
if self.task in ["default", "seq2seq-lm"]:
flattened_output = super()._flatten_past_key_values_(flattened_output, name, idx, t)
else:
flattened_output = super(OnnxSeq2SeqConfigWithPast, self)._flatten_past_key_values_(
flattened_output, name, idx, t
)
__all__ = ["BlenderbotSmallConfig", "BlenderbotSmallOnnxConfig"]

View File

@@ -14,8 +14,19 @@
# limitations under the License.
"""Bloom configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from typing import TYPE_CHECKING, Any, Optional
from packaging import version
if TYPE_CHECKING:
from ... import PreTrainedTokenizer
from ...configuration_utils import PreTrainedConfig
from ...utils import logging
from ...onnx import OnnxConfigWithPast, PatchingSpec
from ...utils import is_torch_available, logging
logger = logging.get_logger(__name__)
@@ -131,4 +142,99 @@ class BloomConfig(PreTrainedConfig):
super().__init__(bos_token_id=bos_token_id, eos_token_id=eos_token_id, **kwargs)
__all__ = ["BloomConfig"]
class BloomOnnxConfig(OnnxConfigWithPast):
torch_onnx_minimum_version = version.parse("1.12")
def __init__(
self,
config: PreTrainedConfig,
task: str = "default",
patching_specs: Optional[list[PatchingSpec]] = None,
use_past: bool = False,
):
super().__init__(config, task=task, patching_specs=patching_specs, use_past=use_past)
if not getattr(self._config, "pad_token_id", None):
# TODO: how to do that better?
self._config.pad_token_id = 0
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
common_inputs = OrderedDict({"input_ids": {0: "batch", 1: "sequence"}})
if self.use_past:
# BLOOM stores values on dynamic axis 2. For more details see: https://github.com/huggingface/transformers/pull/18344
self.fill_with_past_key_values_(common_inputs, direction="inputs", inverted_values_shape=True)
common_inputs["attention_mask"] = {0: "batch", 1: "past_sequence + sequence"}
else:
common_inputs["attention_mask"] = {0: "batch", 1: "sequence"}
return common_inputs
@property
def num_layers(self) -> int:
return self._config.n_layer
@property
def num_attention_heads(self) -> int:
return self._config.n_head
@property
def atol_for_validation(self) -> float:
return 1e-3
def generate_dummy_inputs(
self,
tokenizer: "PreTrainedTokenizer",
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
common_inputs = super(OnnxConfigWithPast, self).generate_dummy_inputs(
tokenizer,
batch_size=batch_size,
seq_length=seq_length,
is_pair=is_pair,
)
        # We need to order the inputs in the way they appear in the model's forward()
ordered_inputs = OrderedDict({"input_ids": common_inputs["input_ids"]})
        # We also need to add the past key values
if self.use_past:
if not is_torch_available():
raise ValueError("Cannot generate dummy past_keys inputs without PyTorch installed.")
else:
import torch
batch, seqlen = common_inputs["input_ids"].shape
# Not using the same length for past_key_values
past_key_values_length = seqlen + 2
head_dim = self._config.hidden_size // self.num_attention_heads
past_key_shape = (
batch * self.num_attention_heads,
head_dim,
past_key_values_length,
)
past_value_shape = (
batch * self.num_attention_heads,
past_key_values_length,
head_dim,
)
ordered_inputs["past_key_values"] = [
(torch.zeros(past_key_shape), torch.zeros(past_value_shape)) for _ in range(self.num_layers)
]
ordered_inputs["attention_mask"] = common_inputs["attention_mask"]
if self.use_past:
mask_dtype = ordered_inputs["attention_mask"].dtype
ordered_inputs["attention_mask"] = torch.cat(
[ordered_inputs["attention_mask"], torch.ones(batch, past_key_values_length, dtype=mask_dtype)], dim=1
)
return ordered_inputs
@property
def default_onnx_opset(self) -> int:
return 13
__all__ = ["BloomConfig", "BloomOnnxConfig"]

View File

@ -15,7 +15,11 @@
# limitations under the License.
"""CamemBERT configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -125,4 +129,19 @@ class CamembertConfig(PreTrainedConfig):
self.classifier_dropout = classifier_dropout
__all__ = ["CamembertConfig"]
class CamembertOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
return OrderedDict(
[
("input_ids", dynamic_axis),
("attention_mask", dynamic_axis),
]
)
__all__ = ["CamembertConfig", "CamembertOnnxConfig"]

View File

@ -14,7 +14,16 @@
# limitations under the License.
"""Chinese-CLIP model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from typing import TYPE_CHECKING, Any
if TYPE_CHECKING:
from ...processing_utils import ProcessorMixin
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -359,4 +368,52 @@ class ChineseCLIPConfig(PreTrainedConfig):
super().__init__(**kwargs)
__all__ = ["ChineseCLIPConfig", "ChineseCLIPTextConfig", "ChineseCLIPVisionConfig"]
class ChineseCLIPOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("input_ids", {0: "batch", 1: "sequence"}),
("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
("attention_mask", {0: "batch", 1: "sequence"}),
]
)
@property
def outputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("logits_per_image", {0: "batch"}),
("logits_per_text", {0: "batch"}),
("text_embeds", {0: "batch"}),
("image_embeds", {0: "batch"}),
]
)
@property
def atol_for_validation(self) -> float:
return 1e-4
def generate_dummy_inputs(
self,
processor: "ProcessorMixin",
batch_size: int = -1,
seq_length: int = -1,
) -> Mapping[str, Any]:
text_input_dict = super().generate_dummy_inputs(
processor.tokenizer,
batch_size=batch_size,
seq_length=seq_length,
)
image_input_dict = super().generate_dummy_inputs(
processor.image_processor,
batch_size=batch_size,
)
return {**text_input_dict, **image_input_dict}
@property
def default_onnx_opset(self) -> int:
return 14
__all__ = ["ChineseCLIPConfig", "ChineseCLIPOnnxConfig", "ChineseCLIPTextConfig", "ChineseCLIPVisionConfig"]

View File

@ -14,7 +14,16 @@
# limitations under the License.
"""CLIP model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from typing import TYPE_CHECKING, Any
if TYPE_CHECKING:
from ...processing_utils import ProcessorMixin
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -355,4 +364,52 @@ class CLIPConfig(PreTrainedConfig):
super().__init__(**kwargs)
__all__ = ["CLIPConfig", "CLIPTextConfig", "CLIPVisionConfig"]
class CLIPOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("input_ids", {0: "batch", 1: "sequence"}),
("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
("attention_mask", {0: "batch", 1: "sequence"}),
]
)
@property
def outputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("logits_per_image", {0: "batch"}),
("logits_per_text", {0: "batch"}),
("text_embeds", {0: "batch"}),
("image_embeds", {0: "batch"}),
]
)
@property
def atol_for_validation(self) -> float:
return 1e-4
def generate_dummy_inputs(
self,
processor: "ProcessorMixin",
batch_size: int = -1,
seq_length: int = -1,
) -> Mapping[str, Any]:
text_input_dict = super().generate_dummy_inputs(
processor.tokenizer,
batch_size=batch_size,
seq_length=seq_length,
)
image_input_dict = super().generate_dummy_inputs(
processor.image_processor,
batch_size=batch_size,
)
return {**text_input_dict, **image_input_dict}
@property
def default_onnx_opset(self) -> int:
return 14
__all__ = ["CLIPConfig", "CLIPOnnxConfig", "CLIPTextConfig", "CLIPVisionConfig"]

View File

@ -14,7 +14,13 @@
# limitations under the License.
"""CodeGen model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from typing import Any, Optional
from ... import PreTrainedTokenizer, is_torch_available
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfigWithPast, PatchingSpec
from ...utils import logging
@ -140,4 +146,85 @@ class CodeGenConfig(PreTrainedConfig):
)
__all__ = ["CodeGenConfig"]
# Copied from transformers.models.gpt2.configuration_gpt2.GPT2OnnxConfig with GPT2->CodeGen
class CodeGenOnnxConfig(OnnxConfigWithPast):
def __init__(
self,
config: PreTrainedConfig,
task: str = "default",
patching_specs: Optional[list[PatchingSpec]] = None,
use_past: bool = False,
):
super().__init__(config, task=task, patching_specs=patching_specs, use_past=use_past)
if not getattr(self._config, "pad_token_id", None):
# TODO: how to do that better?
self._config.pad_token_id = 0
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
common_inputs = OrderedDict({"input_ids": {0: "batch", 1: "sequence"}})
if self.use_past:
self.fill_with_past_key_values_(common_inputs, direction="inputs")
common_inputs["attention_mask"] = {0: "batch", 1: "past_sequence + sequence"}
else:
common_inputs["attention_mask"] = {0: "batch", 1: "sequence"}
return common_inputs
@property
def num_layers(self) -> int:
return self._config.n_layer
@property
def num_attention_heads(self) -> int:
return self._config.n_head
def generate_dummy_inputs(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
common_inputs = super(OnnxConfigWithPast, self).generate_dummy_inputs(
tokenizer, batch_size=batch_size, seq_length=seq_length, is_pair=is_pair
)
        # We need to order the inputs in the way they appear in the model's forward()
ordered_inputs = OrderedDict({"input_ids": common_inputs["input_ids"]})
        # We also need to add the past key values
if self.use_past:
if not is_torch_available():
raise ValueError("Cannot generate dummy past_keys inputs without PyTorch installed.")
else:
import torch
batch, seqlen = common_inputs["input_ids"].shape
# Not using the same length for past_key_values
past_key_values_length = seqlen + 2
past_shape = (
batch,
self.num_attention_heads,
past_key_values_length,
self._config.hidden_size // self.num_attention_heads,
)
ordered_inputs["past_key_values"] = [
(torch.zeros(past_shape), torch.zeros(past_shape)) for _ in range(self.num_layers)
]
ordered_inputs["attention_mask"] = common_inputs["attention_mask"]
if self.use_past:
mask_dtype = ordered_inputs["attention_mask"].dtype
ordered_inputs["attention_mask"] = torch.cat(
[ordered_inputs["attention_mask"], torch.ones(batch, past_key_values_length, dtype=mask_dtype)], dim=1
)
return ordered_inputs
@property
def default_onnx_opset(self) -> int:
return 13
__all__ = ["CodeGenConfig", "CodeGenOnnxConfig"]

View File

@ -14,7 +14,13 @@
# limitations under the License.
"""Conditional DETR model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from packaging import version
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
from ...utils.backbone_utils import verify_backbone_config_arguments
from ..auto import CONFIG_MAPPING, AutoConfig
@ -241,4 +247,25 @@ class ConditionalDetrConfig(PreTrainedConfig):
super().__init__(is_encoder_decoder=is_encoder_decoder, **kwargs)
__all__ = ["ConditionalDetrConfig"]
class ConditionalDetrOnnxConfig(OnnxConfig):
torch_onnx_minimum_version = version.parse("1.11")
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
("pixel_mask", {0: "batch"}),
]
)
@property
def atol_for_validation(self) -> float:
return 1e-5
@property
def default_onnx_opset(self) -> int:
return 12
__all__ = ["ConditionalDetrConfig", "ConditionalDetrOnnxConfig"]

View File

@ -14,7 +14,11 @@
# limitations under the License.
"""ConvBERT model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -136,4 +140,21 @@ class ConvBertConfig(PreTrainedConfig):
self.classifier_dropout = classifier_dropout
__all__ = ["ConvBertConfig"]
# Copied from transformers.models.bert.configuration_bert.BertOnnxConfig
class ConvBertOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
return OrderedDict(
[
("input_ids", dynamic_axis),
("attention_mask", dynamic_axis),
("token_type_ids", dynamic_axis),
]
)
__all__ = ["ConvBertConfig", "ConvBertOnnxConfig"]

View File

@ -14,7 +14,13 @@
# limitations under the License.
"""ConvNeXT model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from packaging import version
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
from ...utils.backbone_utils import BackboneConfigMixin, get_aligned_output_features_output_indices
@ -117,4 +123,20 @@ class ConvNextConfig(BackboneConfigMixin, PreTrainedConfig):
)
__all__ = ["ConvNextConfig"]
class ConvNextOnnxConfig(OnnxConfig):
torch_onnx_minimum_version = version.parse("1.11")
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
]
)
@property
def atol_for_validation(self) -> float:
return 1e-5
__all__ = ["ConvNextConfig", "ConvNextOnnxConfig"]

View File

@ -14,7 +14,11 @@
# limitations under the License.
"""Data2VecText configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -124,4 +128,19 @@ class Data2VecTextConfig(PreTrainedConfig):
self.classifier_dropout = classifier_dropout
__all__ = ["Data2VecTextConfig"]
class Data2VecTextOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
return OrderedDict(
[
("input_ids", dynamic_axis),
("attention_mask", dynamic_axis),
]
)
__all__ = ["Data2VecTextConfig", "Data2VecTextOnnxConfig"]

View File

@ -14,7 +14,13 @@
# limitations under the License.
"""Data2VecVision model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from packaging import version
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -168,4 +174,21 @@ class Data2VecVisionConfig(PreTrainedConfig):
self.semantic_loss_ignore_index = semantic_loss_ignore_index
__all__ = ["Data2VecVisionConfig"]
# Copied from transformers.models.vit.configuration_vit.ViTOnnxConfig
class Data2VecVisionOnnxConfig(OnnxConfig):
torch_onnx_minimum_version = version.parse("1.11")
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
]
)
@property
def atol_for_validation(self) -> float:
return 1e-4
__all__ = ["Data2VecVisionConfig", "Data2VecVisionOnnxConfig"]

View File

@ -14,10 +14,19 @@
# limitations under the License.
"""DeBERTa model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from typing import TYPE_CHECKING, Any, Union
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
if TYPE_CHECKING:
from ... import FeatureExtractionMixin, PreTrainedTokenizerBase
logger = logging.get_logger(__name__)
@ -150,4 +159,41 @@ class DebertaConfig(PreTrainedConfig):
self.legacy = legacy
__all__ = ["DebertaConfig"]
# Copied from transformers.models.deberta_v2.configuration_deberta_v2.DebertaV2OnnxConfig
class DebertaOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
if self._config.type_vocab_size > 0:
return OrderedDict(
[("input_ids", dynamic_axis), ("attention_mask", dynamic_axis), ("token_type_ids", dynamic_axis)]
)
else:
return OrderedDict([("input_ids", dynamic_axis), ("attention_mask", dynamic_axis)])
@property
def default_onnx_opset(self) -> int:
return 12
def generate_dummy_inputs(
self,
preprocessor: Union["PreTrainedTokenizerBase", "FeatureExtractionMixin"],
batch_size: int = -1,
seq_length: int = -1,
num_choices: int = -1,
is_pair: bool = False,
num_channels: int = 3,
image_width: int = 40,
image_height: int = 40,
tokenizer: "PreTrainedTokenizerBase" = None,
) -> Mapping[str, Any]:
dummy_inputs = super().generate_dummy_inputs(preprocessor=preprocessor)
if self._config.type_vocab_size == 0 and "token_type_ids" in dummy_inputs:
del dummy_inputs["token_type_ids"]
return dummy_inputs
__all__ = ["DebertaConfig", "DebertaOnnxConfig"]

View File

@ -14,10 +14,19 @@
# limitations under the License.
"""DeBERTa-v2 model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from typing import TYPE_CHECKING, Any, Union
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
if TYPE_CHECKING:
from ... import FeatureExtractionMixin, PreTrainedTokenizerBase
logger = logging.get_logger(__name__)
@ -150,4 +159,40 @@ class DebertaV2Config(PreTrainedConfig):
self.legacy = legacy
__all__ = ["DebertaV2Config"]
class DebertaV2OnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
if self._config.type_vocab_size > 0:
return OrderedDict(
[("input_ids", dynamic_axis), ("attention_mask", dynamic_axis), ("token_type_ids", dynamic_axis)]
)
else:
return OrderedDict([("input_ids", dynamic_axis), ("attention_mask", dynamic_axis)])
@property
def default_onnx_opset(self) -> int:
return 12
def generate_dummy_inputs(
self,
preprocessor: Union["PreTrainedTokenizerBase", "FeatureExtractionMixin"],
batch_size: int = -1,
seq_length: int = -1,
num_choices: int = -1,
is_pair: bool = False,
num_channels: int = 3,
image_width: int = 40,
image_height: int = 40,
tokenizer: "PreTrainedTokenizerBase" = None,
) -> Mapping[str, Any]:
dummy_inputs = super().generate_dummy_inputs(preprocessor=preprocessor)
if self._config.type_vocab_size == 0 and "token_type_ids" in dummy_inputs:
del dummy_inputs["token_type_ids"]
return dummy_inputs
__all__ = ["DebertaV2Config", "DebertaV2OnnxConfig"]

View File

@ -14,7 +14,13 @@
# limitations under the License.
"""DeiT model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from packaging import version
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -125,4 +131,20 @@ class DeiTConfig(PreTrainedConfig):
self.pooler_act = pooler_act
__all__ = ["DeiTConfig"]
class DeiTOnnxConfig(OnnxConfig):
torch_onnx_minimum_version = version.parse("1.11")
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
]
)
@property
def atol_for_validation(self) -> float:
return 1e-4
__all__ = ["DeiTConfig", "DeiTOnnxConfig"]

View File

@ -14,7 +14,11 @@
# limitations under the License.
"""MEGA configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from ....configuration_utils import PreTrainedConfig
from ....onnx import OnnxConfig
from ....utils import logging
@ -221,4 +225,19 @@ class MegaConfig(PreTrainedConfig):
self.num_attention_heads = 1 # not used but required by Hugging Face
__all__ = ["MegaConfig"]
class MegaOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
return OrderedDict(
[
("input_ids", dynamic_axis),
("attention_mask", dynamic_axis),
]
)
__all__ = ["MegaConfig", "MegaOnnxConfig"]

View File

@ -14,7 +14,13 @@
# limitations under the License.
"""DETR model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from packaging import version
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
from ...utils.backbone_utils import verify_backbone_config_arguments
from ..auto import CONFIG_MAPPING, AutoConfig
@ -240,4 +246,25 @@ class DetrConfig(PreTrainedConfig):
super().__init__(is_encoder_decoder=is_encoder_decoder, **kwargs)
__all__ = ["DetrConfig"]
class DetrOnnxConfig(OnnxConfig):
torch_onnx_minimum_version = version.parse("1.11")
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
("pixel_mask", {0: "batch"}),
]
)
@property
def atol_for_validation(self) -> float:
return 1e-5
@property
def default_onnx_opset(self) -> int:
return 12
__all__ = ["DetrConfig", "DetrOnnxConfig"]

View File

@ -14,7 +14,13 @@
# limitations under the License.
"""DINOv2 model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from packaging import version
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
from ...utils.backbone_utils import BackboneConfigMixin, get_aligned_output_features_output_indices
@ -154,4 +160,20 @@ class Dinov2Config(BackboneConfigMixin, PreTrainedConfig):
self.use_mask_token = use_mask_token
__all__ = ["Dinov2Config"]
class Dinov2OnnxConfig(OnnxConfig):
torch_onnx_minimum_version = version.parse("1.11")
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
]
)
@property
def atol_for_validation(self) -> float:
return 1e-4
__all__ = ["Dinov2Config", "Dinov2OnnxConfig"]

View File

@ -14,7 +14,11 @@
# limitations under the License.
"""DistilBERT model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -119,4 +123,19 @@ class DistilBertConfig(PreTrainedConfig):
super().__init__(**kwargs, pad_token_id=pad_token_id)
__all__ = ["DistilBertConfig"]
class DistilBertOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
return OrderedDict(
[
("input_ids", dynamic_axis),
("attention_mask", dynamic_axis),
]
)
__all__ = ["DistilBertConfig", "DistilBertOnnxConfig"]

View File

@ -14,7 +14,13 @@
# limitations under the License.
"""EfficientNet model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from packaging import version
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -144,4 +150,20 @@ class EfficientNetConfig(PreTrainedConfig):
self.num_hidden_layers = sum(num_block_repeats) * 4
__all__ = ["EfficientNetConfig"]
class EfficientNetOnnxConfig(OnnxConfig):
torch_onnx_minimum_version = version.parse("1.11")
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
]
)
@property
def atol_for_validation(self) -> float:
return 1e-5
__all__ = ["EfficientNetConfig", "EfficientNetOnnxConfig"]

View File

@ -15,7 +15,11 @@
# limitations under the License.
"""ELECTRA model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -156,4 +160,20 @@ class ElectraConfig(PreTrainedConfig):
self.classifier_dropout = classifier_dropout
__all__ = ["ElectraConfig"]
class ElectraOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
return OrderedDict(
[
("input_ids", dynamic_axis),
("attention_mask", dynamic_axis),
("token_type_ids", dynamic_axis),
]
)
__all__ = ["ElectraConfig", "ElectraOnnxConfig"]

View File

@ -15,7 +15,11 @@
# limitations under the License.
"""ERNIE model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -131,4 +135,21 @@ class ErnieConfig(PreTrainedConfig):
self.classifier_dropout = classifier_dropout
__all__ = ["ErnieConfig"]
class ErnieOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
return OrderedDict(
[
("input_ids", dynamic_axis),
("attention_mask", dynamic_axis),
("token_type_ids", dynamic_axis),
("task_type_ids", dynamic_axis),
]
)
__all__ = ["ErnieConfig", "ErnieOnnxConfig"]

View File

@ -14,7 +14,11 @@
# limitations under the License.
"""Flaubert configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -213,4 +217,19 @@ class FlaubertConfig(PreTrainedConfig):
super().__init__(pad_token_id=pad_token_id, bos_token_id=bos_token_id, **kwargs)
__all__ = ["FlaubertConfig"]
class FlaubertOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
return OrderedDict(
[
("input_ids", dynamic_axis),
("attention_mask", dynamic_axis),
]
)
__all__ = ["FlaubertConfig", "FlaubertOnnxConfig"]

View File

@ -15,7 +15,13 @@
# limitations under the License.
"""OpenAI GPT-2 configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from typing import Any, Optional
from ... import PreTrainedTokenizer, is_torch_available
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfigWithPast, PatchingSpec
from ...utils import logging
@ -184,4 +190,84 @@ class GPT2Config(PreTrainedConfig):
super().__init__(bos_token_id=bos_token_id, eos_token_id=eos_token_id, **kwargs)
__all__ = ["GPT2Config"]
class GPT2OnnxConfig(OnnxConfigWithPast):
def __init__(
self,
config: PreTrainedConfig,
task: str = "default",
patching_specs: Optional[list[PatchingSpec]] = None,
use_past: bool = False,
):
super().__init__(config, task=task, patching_specs=patching_specs, use_past=use_past)
if not getattr(self._config, "pad_token_id", None):
# TODO: how to do that better?
self._config.pad_token_id = 0
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
common_inputs = OrderedDict({"input_ids": {0: "batch", 1: "sequence"}})
if self.use_past:
self.fill_with_past_key_values_(common_inputs, direction="inputs")
common_inputs["attention_mask"] = {0: "batch", 1: "past_sequence + sequence"}
else:
common_inputs["attention_mask"] = {0: "batch", 1: "sequence"}
return common_inputs
@property
def num_layers(self) -> int:
return self._config.n_layer
@property
def num_attention_heads(self) -> int:
return self._config.n_head
def generate_dummy_inputs(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
common_inputs = super(OnnxConfigWithPast, self).generate_dummy_inputs(
tokenizer, batch_size=batch_size, seq_length=seq_length, is_pair=is_pair
)
        # We need to order the inputs in the way they appear in the model's forward()
ordered_inputs = OrderedDict({"input_ids": common_inputs["input_ids"]})
        # We also need to add the past key values
if self.use_past:
if not is_torch_available():
raise ValueError("Cannot generate dummy past_keys inputs without PyTorch installed.")
else:
import torch
batch, seqlen = common_inputs["input_ids"].shape
# Not using the same length for past_key_values
past_key_values_length = seqlen + 2
past_shape = (
batch,
self.num_attention_heads,
past_key_values_length,
self._config.hidden_size // self.num_attention_heads,
)
ordered_inputs["past_key_values"] = [
(torch.zeros(past_shape), torch.zeros(past_shape)) for _ in range(self.num_layers)
]
ordered_inputs["attention_mask"] = common_inputs["attention_mask"]
if self.use_past:
mask_dtype = ordered_inputs["attention_mask"].dtype
ordered_inputs["attention_mask"] = torch.cat(
[ordered_inputs["attention_mask"], torch.ones(batch, past_key_values_length, dtype=mask_dtype)], dim=1
)
return ordered_inputs
@property
def default_onnx_opset(self) -> int:
return 13
__all__ = ["GPT2Config", "GPT2OnnxConfig"]

View File

@ -14,7 +14,13 @@
# limitations under the License.
"""GPT Neo model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from typing import Any
from ... import PreTrainedTokenizer, is_torch_available
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfigWithPast
from ...utils import logging
@ -199,4 +205,71 @@ def custom_get_block_length_and_num_blocks(seq_length, window_size):
return largest_divisor, torch.div(seq_length, largest_divisor, rounding_mode="floor")
__all__ = ["GPTNeoConfig"]
class GPTNeoOnnxConfig(OnnxConfigWithPast):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
common_inputs = OrderedDict({"input_ids": {0: "batch", 1: "sequence"}})
if self.use_past:
self.fill_with_past_key_values_(common_inputs, direction="inputs")
common_inputs["attention_mask"] = {0: "batch", 1: "past_sequence + sequence"}
else:
common_inputs["attention_mask"] = {0: "batch", 1: "sequence"}
return common_inputs
@property
def num_attention_heads(self) -> int:
return self._config.num_heads
def generate_dummy_inputs(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
common_inputs = super(OnnxConfigWithPast, self).generate_dummy_inputs(
tokenizer,
batch_size=batch_size,
seq_length=seq_length,
is_pair=is_pair,
)
        # We need to order the inputs in the way they appear in the model's forward()
ordered_inputs = OrderedDict({"input_ids": common_inputs["input_ids"]})
        # We also need to add the past key values
if self.use_past:
if not is_torch_available():
raise ValueError("Cannot generate dummy past_keys inputs without PyTorch installed.")
else:
import torch
batch, seqlen = common_inputs["input_ids"].shape
# Not using the same length for past_key_values
past_key_values_length = seqlen + 2
past_shape = (
batch,
self.num_attention_heads,
past_key_values_length,
self._config.hidden_size // self.num_attention_heads,
)
ordered_inputs["past_key_values"] = [
(torch.zeros(past_shape), torch.zeros(past_shape)) for _ in range(self.num_layers)
]
ordered_inputs["attention_mask"] = common_inputs["attention_mask"]
if self.use_past:
mask_dtype = ordered_inputs["attention_mask"].dtype
ordered_inputs["attention_mask"] = torch.cat(
[ordered_inputs["attention_mask"], torch.ones(batch, past_key_values_length, dtype=mask_dtype)], dim=1
)
return ordered_inputs
@property
def default_onnx_opset(self) -> int:
return 13
__all__ = ["GPTNeoConfig", "GPTNeoOnnxConfig"]

View File

@ -14,7 +14,13 @@
# limitations under the License.
"""GPT-J model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from typing import Any, Optional
from ... import PreTrainedTokenizer, is_torch_available
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfigWithPast, PatchingSpec
from ...utils import logging
@ -129,4 +135,85 @@ class GPTJConfig(PreTrainedConfig):
)
__all__ = ["GPTJConfig"]
# Copied from transformers.models.gpt2.configuration_gpt2.GPT2OnnxConfig
class GPTJOnnxConfig(OnnxConfigWithPast):
def __init__(
self,
config: PreTrainedConfig,
task: str = "default",
patching_specs: Optional[list[PatchingSpec]] = None,
use_past: bool = False,
):
super().__init__(config, task=task, patching_specs=patching_specs, use_past=use_past)
if not getattr(self._config, "pad_token_id", None):
# TODO: how to do that better?
self._config.pad_token_id = 0
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
common_inputs = OrderedDict({"input_ids": {0: "batch", 1: "sequence"}})
if self.use_past:
self.fill_with_past_key_values_(common_inputs, direction="inputs")
common_inputs["attention_mask"] = {0: "batch", 1: "past_sequence + sequence"}
else:
common_inputs["attention_mask"] = {0: "batch", 1: "sequence"}
return common_inputs
@property
def num_layers(self) -> int:
return self._config.n_layer
@property
def num_attention_heads(self) -> int:
return self._config.n_head
def generate_dummy_inputs(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
common_inputs = super(OnnxConfigWithPast, self).generate_dummy_inputs(
tokenizer, batch_size=batch_size, seq_length=seq_length, is_pair=is_pair
)
        # We need to order the inputs in the way they appear in the model's forward()
ordered_inputs = OrderedDict({"input_ids": common_inputs["input_ids"]})
        # We also need to add the past key values
if self.use_past:
if not is_torch_available():
raise ValueError("Cannot generate dummy past_keys inputs without PyTorch installed.")
else:
import torch
batch, seqlen = common_inputs["input_ids"].shape
# Not using the same length for past_key_values
past_key_values_length = seqlen + 2
past_shape = (
batch,
self.num_attention_heads,
past_key_values_length,
self._config.hidden_size // self.num_attention_heads,
)
ordered_inputs["past_key_values"] = [
(torch.zeros(past_shape), torch.zeros(past_shape)) for _ in range(self.num_layers)
]
ordered_inputs["attention_mask"] = common_inputs["attention_mask"]
if self.use_past:
mask_dtype = ordered_inputs["attention_mask"].dtype
ordered_inputs["attention_mask"] = torch.cat(
[ordered_inputs["attention_mask"], torch.ones(batch, past_key_values_length, dtype=mask_dtype)], dim=1
)
return ordered_inputs
@property
def default_onnx_opset(self) -> int:
return 13
__all__ = ["GPTJConfig", "GPTJOnnxConfig"]

View File

@ -14,10 +14,19 @@
# limitations under the License.
"""GroupViT model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from typing import TYPE_CHECKING, Any
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
if TYPE_CHECKING:
from ...processing_utils import ProcessorMixin
logger = logging.get_logger(__name__)
@ -351,4 +360,52 @@ class GroupViTConfig(PreTrainedConfig):
super().__init__(**kwargs)
__all__ = ["GroupViTConfig", "GroupViTTextConfig", "GroupViTVisionConfig"]
class GroupViTOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("input_ids", {0: "batch", 1: "sequence"}),
("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
("attention_mask", {0: "batch", 1: "sequence"}),
]
)
@property
def outputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("logits_per_image", {0: "batch"}),
("logits_per_text", {0: "batch"}),
("text_embeds", {0: "batch"}),
("image_embeds", {0: "batch"}),
]
)
@property
def atol_for_validation(self) -> float:
return 1e-4
def generate_dummy_inputs(
self,
processor: "ProcessorMixin",
batch_size: int = -1,
seq_length: int = -1,
) -> Mapping[str, Any]:
text_input_dict = super().generate_dummy_inputs(
processor.tokenizer,
batch_size=batch_size,
seq_length=seq_length,
)
image_input_dict = super().generate_dummy_inputs(
processor.image_processor,
batch_size=batch_size,
)
return {**text_input_dict, **image_input_dict}
@property
def default_onnx_opset(self) -> int:
return 14
__all__ = ["GroupViTConfig", "GroupViTOnnxConfig", "GroupViTTextConfig", "GroupViTVisionConfig"]

View File

@ -16,7 +16,11 @@
# limitations under the License.
"""I-BERT configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -112,4 +116,19 @@ class IBertConfig(PreTrainedConfig):
self.force_dequant = force_dequant
__all__ = ["IBertConfig"]
class IBertOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
return OrderedDict(
[
("input_ids", dynamic_axis),
("attention_mask", dynamic_axis),
]
)
__all__ = ["IBertConfig", "IBertOnnxConfig"]

View File

@ -14,10 +14,18 @@
# limitations under the License.
"""OpenAI ImageGPT configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from typing import TYPE_CHECKING, Any
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
if TYPE_CHECKING:
from ... import FeatureExtractionMixin
logger = logging.get_logger(__name__)
@ -136,4 +144,54 @@ class ImageGPTConfig(PreTrainedConfig):
super().__init__(tie_word_embeddings=tie_word_embeddings, **kwargs)
__all__ = ["ImageGPTConfig"]
class ImageGPTOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("input_ids", {0: "batch", 1: "sequence"}),
]
)
def generate_dummy_inputs(
self,
preprocessor: "FeatureExtractionMixin",
batch_size: int = 1,
seq_length: int = -1,
is_pair: bool = False,
num_channels: int = 3,
image_width: int = 32,
image_height: int = 32,
) -> Mapping[str, Any]:
"""
Generate inputs to provide to the ONNX exporter.
Args:
preprocessor ([`PreTrainedTokenizerBase`] or [`FeatureExtractionMixin`]):
The preprocessor associated with this model configuration.
batch_size (`int`, *optional*, defaults to -1):
The batch size to export the model for (-1 means dynamic axis).
num_choices (`int`, *optional*, defaults to -1):
The number of candidate answers provided for multiple choice task (-1 means dynamic axis).
seq_length (`int`, *optional*, defaults to -1):
The sequence length to export the model for (-1 means dynamic axis).
is_pair (`bool`, *optional*, defaults to `False`):
Indicate if the input is a pair (sentence 1, sentence 2)
num_channels (`int`, *optional*, defaults to 3):
The number of channels of the generated images.
image_width (`int`, *optional*, defaults to 40):
The width of the generated images.
image_height (`int`, *optional*, defaults to 40):
The height of the generated images.
Returns:
Mapping[str, Tensor] holding the kwargs to provide to the model's forward function
"""
input_image = self._generate_dummy_images(batch_size, num_channels, image_height, image_width)
inputs = dict(preprocessor(images=input_image, return_tensors="pt"))
return inputs
__all__ = ["ImageGPTConfig", "ImageGPTOnnxConfig"]

View File

@ -14,8 +14,13 @@
# limitations under the License.
"""LayoutLM model configuration"""
from ... import PreTrainedConfig
from ...utils import logging
from collections import OrderedDict
from collections.abc import Mapping
from typing import Any, Optional
from ... import PreTrainedConfig, PreTrainedTokenizer
from ...onnx import OnnxConfig, PatchingSpec
from ...utils import is_torch_available, logging
logger = logging.get_logger(__name__)
@ -122,4 +127,64 @@ class LayoutLMConfig(PreTrainedConfig):
self.max_2d_position_embeddings = max_2d_position_embeddings
__all__ = ["LayoutLMConfig"]
class LayoutLMOnnxConfig(OnnxConfig):
def __init__(
self,
config: PreTrainedConfig,
task: str = "default",
patching_specs: Optional[list[PatchingSpec]] = None,
):
super().__init__(config, task=task, patching_specs=patching_specs)
self.max_2d_positions = config.max_2d_position_embeddings - 1
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("input_ids", {0: "batch", 1: "sequence"}),
("bbox", {0: "batch", 1: "sequence"}),
("attention_mask", {0: "batch", 1: "sequence"}),
("token_type_ids", {0: "batch", 1: "sequence"}),
]
)
def generate_dummy_inputs(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
"""
Generate inputs to provide to the ONNX exporter
Args:
tokenizer: The tokenizer associated with this model configuration
batch_size: The batch size (int) to export the model for (-1 means dynamic axis)
seq_length: The sequence length (int) to export the model for (-1 means dynamic axis)
is_pair: Indicate if the input is a pair (sentence 1, sentence 2)
Returns:
Mapping[str, Tensor] holding the kwargs to provide to the model's forward function
"""
input_dict = super().generate_dummy_inputs(
tokenizer,
batch_size=batch_size,
seq_length=seq_length,
is_pair=is_pair,
)
# Generate a dummy bbox
box = [48, 84, 73, 128]
if not is_torch_available():
raise ValueError("Cannot generate dummy inputs without PyTorch installed.")
import torch
batch_size, seq_length = input_dict["input_ids"].shape
input_dict["bbox"] = torch.tensor([*[box] * seq_length]).tile(batch_size, 1, 1)
return input_dict
__all__ = ["LayoutLMConfig", "LayoutLMOnnxConfig"]

View File

@ -14,10 +14,22 @@
# limitations under the License.
"""LayoutLMv3 model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from typing import TYPE_CHECKING, Any
from packaging import version
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...onnx.utils import compute_effective_axis_dimension
from ...utils import logging
if TYPE_CHECKING:
from ...processing_utils import ProcessorMixin
logger = logging.get_logger(__name__)
@ -175,4 +187,104 @@ class LayoutLMv3Config(PreTrainedConfig):
self.classifier_dropout = classifier_dropout
__all__ = ["LayoutLMv3Config"]
class LayoutLMv3OnnxConfig(OnnxConfig):
torch_onnx_minimum_version = version.parse("1.12")
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
# The order of inputs is different for question answering and sequence classification
if self.task in ["question-answering", "sequence-classification"]:
return OrderedDict(
[
("input_ids", {0: "batch", 1: "sequence"}),
("attention_mask", {0: "batch", 1: "sequence"}),
("bbox", {0: "batch", 1: "sequence"}),
("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
]
)
else:
return OrderedDict(
[
("input_ids", {0: "batch", 1: "sequence"}),
("bbox", {0: "batch", 1: "sequence"}),
("attention_mask", {0: "batch", 1: "sequence"}),
("pixel_values", {0: "batch", 1: "num_channels"}),
]
)
@property
def atol_for_validation(self) -> float:
return 1e-5
@property
def default_onnx_opset(self) -> int:
return 12
def generate_dummy_inputs(
self,
processor: "ProcessorMixin",
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
num_channels: int = 3,
image_width: int = 40,
image_height: int = 40,
) -> Mapping[str, Any]:
"""
Generate inputs to provide to the ONNX exporter
Args:
processor ([`ProcessorMixin`]):
The processor associated with this model configuration.
batch_size (`int`, *optional*, defaults to -1):
The batch size to export the model for (-1 means dynamic axis).
seq_length (`int`, *optional*, defaults to -1):
The sequence length to export the model for (-1 means dynamic axis).
is_pair (`bool`, *optional*, defaults to `False`):
Indicate if the input is a pair (sentence 1, sentence 2).
num_channels (`int`, *optional*, defaults to 3):
The number of channels of the generated images.
image_width (`int`, *optional*, defaults to 40):
The width of the generated images.
image_height (`int`, *optional*, defaults to 40):
The height of the generated images.
Returns:
Mapping[str, Any]: holding the kwargs to provide to the model's forward function
"""
# A dummy image is used so OCR should not be applied
setattr(processor.image_processor, "apply_ocr", False)
# If dynamic axis (-1) we forward with a fixed dimension of 2 samples to avoid optimizations made by ONNX
batch_size = compute_effective_axis_dimension(
batch_size, fixed_dimension=OnnxConfig.default_fixed_batch, num_token_to_add=0
)
# If dynamic axis (-1) we forward with a fixed dimension of 8 tokens to avoid optimizations made by ONNX
token_to_add = processor.tokenizer.num_special_tokens_to_add(is_pair)
seq_length = compute_effective_axis_dimension(
seq_length, fixed_dimension=OnnxConfig.default_fixed_sequence, num_token_to_add=token_to_add
)
# Generate dummy inputs according to compute batch and sequence
dummy_text = [[" ".join([processor.tokenizer.unk_token]) * seq_length]] * batch_size
# Generate dummy bounding boxes
dummy_bboxes = [[[48, 84, 73, 128]]] * batch_size
# If dynamic axis (-1) we forward with a fixed dimension of 2 samples to avoid optimizations made by ONNX
# batch_size = compute_effective_axis_dimension(batch_size, fixed_dimension=OnnxConfig.default_fixed_batch)
dummy_image = self._generate_dummy_images(batch_size, num_channels, image_height, image_width)
inputs = dict(
processor(
dummy_image,
text=dummy_text,
boxes=dummy_bboxes,
return_tensors="pt",
)
)
return inputs
__all__ = ["LayoutLMv3Config", "LayoutLMv3OnnxConfig"]

View File

@ -14,7 +14,13 @@
# limitations under the License.
"""LeViT model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from packaging import version
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -118,4 +124,21 @@ class LevitConfig(PreTrainedConfig):
]
__all__ = ["LevitConfig"]
# Copied from transformers.models.vit.configuration_vit.ViTOnnxConfig
class LevitOnnxConfig(OnnxConfig):
torch_onnx_minimum_version = version.parse("1.11")
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
]
)
@property
def atol_for_validation(self) -> float:
return 1e-4
__all__ = ["LevitConfig", "LevitOnnxConfig"]

View File

@ -14,12 +14,20 @@
# limitations under the License.
"""Longformer configuration"""
from typing import Union
from collections import OrderedDict
from collections.abc import Mapping
from typing import TYPE_CHECKING, Any, Optional, Union
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
if TYPE_CHECKING:
from ...onnx.config import PatchingSpec
from ...tokenization_utils_base import PreTrainedTokenizerBase
logger = logging.get_logger(__name__)
@ -131,4 +139,71 @@ class LongformerConfig(PreTrainedConfig):
self.onnx_export = onnx_export
__all__ = ["LongformerConfig"]
class LongformerOnnxConfig(OnnxConfig):
def __init__(
self, config: "PreTrainedConfig", task: str = "default", patching_specs: "Optional[list[PatchingSpec]]" = None
):
super().__init__(config, task, patching_specs)
config.onnx_export = True
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
return OrderedDict(
[
("input_ids", dynamic_axis),
("attention_mask", dynamic_axis),
("global_attention_mask", dynamic_axis),
]
)
@property
def outputs(self) -> Mapping[str, Mapping[int, str]]:
outputs = super().outputs
if self.task == "default":
outputs["pooler_output"] = {0: "batch"}
return outputs
@property
def atol_for_validation(self) -> float:
"""
What absolute tolerance value to use during model conversion validation.
Returns:
Float absolute tolerance value.
"""
return 1e-4
@property
def default_onnx_opset(self) -> int:
# needs to be >= 14 to support tril operator
return max(super().default_onnx_opset, 14)
def generate_dummy_inputs(
self,
tokenizer: "PreTrainedTokenizerBase",
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
inputs = super().generate_dummy_inputs(
preprocessor=tokenizer,
batch_size=batch_size,
seq_length=seq_length,
is_pair=is_pair,
)
import torch
        # for some reason, replacing this code with inputs["global_attention_mask"] = torch.randint(2, inputs["input_ids"].shape, dtype=torch.int64)
# makes the export fail randomly
inputs["global_attention_mask"] = torch.zeros_like(inputs["input_ids"])
# make every second token global
inputs["global_attention_mask"][:, ::2] = 1
return inputs
__all__ = ["LongformerConfig", "LongformerOnnxConfig"]

View File

@ -14,7 +14,10 @@
# limitations under the License.
"""LongT5 model configuration"""
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxSeq2SeqConfigWithPast
from ...utils import logging
@ -149,4 +152,29 @@ class LongT5Config(PreTrainedConfig):
)
__all__ = ["LongT5Config"]
class LongT5OnnxConfig(OnnxSeq2SeqConfigWithPast):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
common_inputs = {
"input_ids": {0: "batch", 1: "encoder_sequence"},
"attention_mask": {0: "batch", 1: "encoder_sequence"},
}
if self.use_past:
common_inputs["attention_mask"][1] = "past_encoder_sequence + sequence"
common_inputs["decoder_input_ids"] = {0: "batch"}
common_inputs["decoder_attention_mask"] = {0: "batch", 1: "past_decoder_sequence + sequence"}
else:
common_inputs["decoder_input_ids"] = {0: "batch", 1: "decoder_sequence"}
common_inputs["decoder_attention_mask"] = {0: "batch", 1: "decoder_sequence"}
if self.use_past:
self.fill_with_past_key_values_(common_inputs, direction="inputs")
return common_inputs
@property
def default_onnx_opset(self) -> int:
return 13
__all__ = ["LongT5Config", "LongT5OnnxConfig"]

View File

@ -14,8 +14,15 @@
# limitations under the License.
"""M2M100 model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from typing import Any
from ... import PreTrainedTokenizer
from ...configuration_utils import PreTrainedConfig
from ...utils import logging
from ...onnx import OnnxConfig, OnnxSeq2SeqConfigWithPast
from ...onnx.utils import compute_effective_axis_dimension
from ...utils import is_torch_available, logging
logger = logging.get_logger(__name__)
@ -151,4 +158,125 @@ class M2M100Config(PreTrainedConfig):
)
__all__ = ["M2M100Config"]
class M2M100OnnxConfig(OnnxSeq2SeqConfigWithPast):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
common_inputs = OrderedDict(
[
("input_ids", {0: "batch", 1: "encoder_sequence"}),
("attention_mask", {0: "batch", 1: "encoder_sequence"}),
]
)
if self.use_past:
common_inputs["decoder_input_ids"] = {0: "batch"}
common_inputs["decoder_attention_mask"] = {0: "batch", 1: "past_decoder_sequence + sequence"}
else:
common_inputs["decoder_input_ids"] = {0: "batch", 1: "decoder_sequence"}
common_inputs["decoder_attention_mask"] = {0: "batch", 1: "decoder_sequence"}
if self.use_past:
self.fill_with_past_key_values_(common_inputs, direction="inputs")
return common_inputs
# Copied from BartOnnxConfig._generate_dummy_inputs_for_sequence_classification_and_question_answering
# A better name would be _generate_dummy_inputs_for_encoder_and_decoder because sequence classification and question
# answering are not supported for M2M100, but this name is preserved to be able to check that the copy matches what
# was done for BART so that it can be updated if need be.
def _generate_dummy_inputs_for_sequence_classification_and_question_answering(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
# Copied from OnnxConfig.generate_dummy_inputs
# Did not use super(OnnxConfigWithPast, self).generate_dummy_inputs for code clarity.
# If dynamic axis (-1) we forward with a fixed dimension of 2 samples to avoid optimizations made by ONNX
batch_size = compute_effective_axis_dimension(
batch_size, fixed_dimension=OnnxConfig.default_fixed_batch, num_token_to_add=0
)
# If dynamic axis (-1) we forward with a fixed dimension of 8 tokens to avoid optimizations made by ONNX
token_to_add = tokenizer.num_special_tokens_to_add(is_pair)
seq_length = compute_effective_axis_dimension(
seq_length, fixed_dimension=OnnxConfig.default_fixed_sequence, num_token_to_add=token_to_add
)
# Generate dummy inputs according to compute batch and sequence
dummy_input = [" ".join([tokenizer.unk_token]) * seq_length] * batch_size
common_inputs = dict(tokenizer(dummy_input, return_tensors="pt"))
return common_inputs
# Copied from transformers.models.bart.configuration_bart.BartOnnxConfig._generate_dummy_inputs_for_default_and_seq2seq_lm
def _generate_dummy_inputs_for_default_and_seq2seq_lm(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
encoder_inputs = self._generate_dummy_inputs_for_sequence_classification_and_question_answering(
tokenizer, batch_size, seq_length, is_pair
)
# Generate decoder inputs
decoder_seq_length = seq_length if not self.use_past else 1
decoder_inputs = self._generate_dummy_inputs_for_sequence_classification_and_question_answering(
tokenizer, batch_size, decoder_seq_length, is_pair
)
decoder_inputs = {f"decoder_{name}": tensor for name, tensor in decoder_inputs.items()}
common_inputs = dict(**encoder_inputs, **decoder_inputs)
if self.use_past:
if not is_torch_available():
raise ValueError("Cannot generate dummy past_keys inputs without PyTorch installed.")
else:
import torch
batch, encoder_seq_length = common_inputs["input_ids"].shape
decoder_seq_length = common_inputs["decoder_input_ids"].shape[1]
num_encoder_attention_heads, num_decoder_attention_heads = self.num_attention_heads
encoder_shape = (
batch,
num_encoder_attention_heads,
encoder_seq_length,
self._config.hidden_size // num_encoder_attention_heads,
)
decoder_past_length = decoder_seq_length + 3
decoder_shape = (
batch,
num_decoder_attention_heads,
decoder_past_length,
self._config.hidden_size // num_decoder_attention_heads,
)
common_inputs["decoder_attention_mask"] = torch.cat(
[common_inputs["decoder_attention_mask"], torch.ones(batch, decoder_past_length)], dim=1
)
common_inputs["past_key_values"] = []
            # If both the encoder and decoder layer counts are present in the model configuration, both are considered
num_encoder_layers, num_decoder_layers = self.num_layers
min_num_layers = min(num_encoder_layers, num_decoder_layers)
max_num_layers = max(num_encoder_layers, num_decoder_layers) - min_num_layers
remaining_side_name = "encoder" if num_encoder_layers > num_decoder_layers else "decoder"
for _ in range(min_num_layers):
common_inputs["past_key_values"].append(
(
torch.zeros(decoder_shape),
torch.zeros(decoder_shape),
torch.zeros(encoder_shape),
torch.zeros(encoder_shape),
)
)
# TODO: test this.
shape = encoder_shape if remaining_side_name == "encoder" else decoder_shape
for _ in range(min_num_layers, max_num_layers):
common_inputs["past_key_values"].append((torch.zeros(shape), torch.zeros(shape)))
return common_inputs
generate_dummy_inputs = _generate_dummy_inputs_for_default_and_seq2seq_lm
__all__ = ["M2M100Config", "M2M100OnnxConfig"]

View File

@ -14,8 +14,15 @@
# limitations under the License.
"""Marian model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from typing import Any
from ... import PreTrainedTokenizer
from ...configuration_utils import PreTrainedConfig
from ...utils import logging
from ...onnx import OnnxConfig, OnnxConfigWithPast, OnnxSeq2SeqConfigWithPast
from ...onnx.utils import compute_effective_axis_dimension
from ...utils import is_torch_available, logging
logger = logging.get_logger(__name__)
@ -157,4 +164,243 @@ class MarianConfig(PreTrainedConfig):
)
__all__ = ["MarianConfig"]
class MarianOnnxConfig(OnnxSeq2SeqConfigWithPast):
@property
# Copied from transformers.models.bart.configuration_bart.BartOnnxConfig.inputs
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task in ["default", "seq2seq-lm"]:
common_inputs = OrderedDict(
[
("input_ids", {0: "batch", 1: "encoder_sequence"}),
("attention_mask", {0: "batch", 1: "encoder_sequence"}),
]
)
if self.use_past:
common_inputs["decoder_input_ids"] = {0: "batch"}
common_inputs["decoder_attention_mask"] = {0: "batch", 1: "past_decoder_sequence + sequence"}
else:
common_inputs["decoder_input_ids"] = {0: "batch", 1: "decoder_sequence"}
common_inputs["decoder_attention_mask"] = {0: "batch", 1: "decoder_sequence"}
if self.use_past:
self.fill_with_past_key_values_(common_inputs, direction="inputs")
elif self.task == "causal-lm":
# TODO: figure this case out.
common_inputs = OrderedDict(
[
("input_ids", {0: "batch", 1: "encoder_sequence"}),
("attention_mask", {0: "batch", 1: "encoder_sequence"}),
]
)
if self.use_past:
num_encoder_layers, _ = self.num_layers
for i in range(num_encoder_layers):
common_inputs[f"past_key_values.{i}.key"] = {0: "batch", 2: "past_sequence + sequence"}
common_inputs[f"past_key_values.{i}.value"] = {0: "batch", 2: "past_sequence + sequence"}
else:
common_inputs = OrderedDict(
[
("input_ids", {0: "batch", 1: "encoder_sequence"}),
("attention_mask", {0: "batch", 1: "encoder_sequence"}),
("decoder_input_ids", {0: "batch", 1: "decoder_sequence"}),
("decoder_attention_mask", {0: "batch", 1: "decoder_sequence"}),
]
)
return common_inputs
@property
# Copied from transformers.models.bart.configuration_bart.BartOnnxConfig.outputs
def outputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task in ["default", "seq2seq-lm"]:
common_outputs = super().outputs
else:
common_outputs = super(OnnxConfigWithPast, self).outputs
if self.use_past:
num_encoder_layers, _ = self.num_layers
for i in range(num_encoder_layers):
common_outputs[f"present.{i}.key"] = {0: "batch", 2: "past_sequence + sequence"}
common_outputs[f"present.{i}.value"] = {0: "batch", 2: "past_sequence + sequence"}
return common_outputs
def _generate_dummy_inputs_for_default_and_seq2seq_lm(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
encoder_inputs = self._generate_dummy_inputs_for_encoder_and_decoder(
tokenizer,
batch_size,
seq_length,
is_pair,
)
# Generate decoder inputs
decoder_seq_length = seq_length if not self.use_past else 1
decoder_inputs = self._generate_dummy_inputs_for_encoder_and_decoder(
tokenizer,
batch_size,
decoder_seq_length,
is_pair,
)
decoder_inputs = {f"decoder_{name}": tensor for name, tensor in decoder_inputs.items()}
common_inputs = dict(**encoder_inputs, **decoder_inputs)
if self.use_past:
if not is_torch_available():
raise ValueError("Cannot generate dummy past_keys inputs without PyTorch installed.")
else:
import torch
batch, encoder_seq_length = common_inputs["input_ids"].shape
decoder_seq_length = common_inputs["decoder_input_ids"].shape[1]
num_encoder_attention_heads, num_decoder_attention_heads = self.num_attention_heads
encoder_shape = (
batch,
num_encoder_attention_heads,
encoder_seq_length,
self._config.hidden_size // num_encoder_attention_heads,
)
decoder_past_length = decoder_seq_length + 3
decoder_shape = (
batch,
num_decoder_attention_heads,
decoder_past_length,
self._config.hidden_size // num_decoder_attention_heads,
)
common_inputs["decoder_attention_mask"] = torch.cat(
[common_inputs["decoder_attention_mask"], torch.ones(batch, decoder_past_length)], dim=1
)
common_inputs["past_key_values"] = []
# If the numbers of encoder and decoder layers differ, fill the shared layers first and handle the remainder below
num_encoder_layers, num_decoder_layers = self.num_layers
min_num_layers = min(num_encoder_layers, num_decoder_layers)
max_num_layers = max(num_encoder_layers, num_decoder_layers) - min_num_layers
remaining_side_name = "encoder" if num_encoder_layers > num_decoder_layers else "decoder"
for _ in range(min_num_layers):
common_inputs["past_key_values"].append(
(
torch.zeros(decoder_shape),
torch.zeros(decoder_shape),
torch.zeros(encoder_shape),
torch.zeros(encoder_shape),
)
)
# TODO: test this.
shape = encoder_shape if remaining_side_name == "encoder" else decoder_shape
for _ in range(min_num_layers, max_num_layers):
common_inputs["past_key_values"].append((torch.zeros(shape), torch.zeros(shape)))
return common_inputs
def _generate_dummy_inputs_for_causal_lm(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
common_inputs = self._generate_dummy_inputs_for_encoder_and_decoder(
tokenizer,
batch_size,
seq_length,
is_pair,
)
if self.use_past:
if not is_torch_available():
raise ValueError("Cannot generate dummy past_keys inputs without PyTorch installed.")
else:
import torch
batch, seqlen = common_inputs["input_ids"].shape
# Not using the same length for past_key_values
past_key_values_length = seqlen + 2
num_encoder_layers, _ = self.num_layers
num_encoder_attention_heads, _ = self.num_attention_heads
past_shape = (
batch,
num_encoder_attention_heads,
past_key_values_length,
self._config.hidden_size // num_encoder_attention_heads,
)
mask_dtype = common_inputs["attention_mask"].dtype
common_inputs["attention_mask"] = torch.cat(
[common_inputs["attention_mask"], torch.ones(batch, past_key_values_length, dtype=mask_dtype)], dim=1
)
common_inputs["past_key_values"] = [
(torch.zeros(past_shape), torch.zeros(past_shape)) for _ in range(num_encoder_layers)
]
return common_inputs
# Copied from BartOnnxConfig._generate_dummy_inputs_for_sequence_classification_and_question_answering
# We renamed this function because Marian models do not have a sequence classification or question answering head
def _generate_dummy_inputs_for_encoder_and_decoder(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
# Copied from OnnxConfig.generate_dummy_inputs
# Did not use super(OnnxConfigWithPast, self).generate_dummy_inputs for code clarity.
# If dynamic axis (-1) we forward with a fixed dimension of 2 samples to avoid optimizations made by ONNX
batch_size = compute_effective_axis_dimension(
batch_size, fixed_dimension=OnnxConfig.default_fixed_batch, num_token_to_add=0
)
# If dynamic axis (-1) we forward with a fixed dimension of 8 tokens to avoid optimizations made by ONNX
token_to_add = tokenizer.num_special_tokens_to_add(is_pair)
seq_length = compute_effective_axis_dimension(
seq_length, fixed_dimension=OnnxConfig.default_fixed_sequence, num_token_to_add=token_to_add
)
# Generate dummy inputs according to the computed batch and sequence
dummy_input = [" ".join([tokenizer.unk_token] * seq_length)] * batch_size
common_inputs = dict(tokenizer(dummy_input, return_tensors="pt"))
return common_inputs
def generate_dummy_inputs(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
if self.task in ["default", "seq2seq-lm"]:
common_inputs = self._generate_dummy_inputs_for_default_and_seq2seq_lm(
tokenizer,
batch_size=batch_size,
seq_length=seq_length,
is_pair=is_pair,
)
else:
common_inputs = self._generate_dummy_inputs_for_causal_lm(
tokenizer,
batch_size=batch_size,
seq_length=seq_length,
is_pair=is_pair,
)
return common_inputs
# Copied from transformers.models.bart.configuration_bart.BartOnnxConfig._flatten_past_key_values_
def _flatten_past_key_values_(self, flattened_output, name, idx, t):
if self.task in ["default", "seq2seq-lm"]:
flattened_output = super()._flatten_past_key_values_(flattened_output, name, idx, t)
else:
flattened_output = super(OnnxSeq2SeqConfigWithPast, self)._flatten_past_key_values_(
flattened_output, name, idx, t
)
@property
def atol_for_validation(self) -> float:
return 1e-4
__all__ = ["MarianConfig", "MarianOnnxConfig"]

View File

@ -14,8 +14,15 @@
# limitations under the License.
"""MBART model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from typing import Any
from ... import PreTrainedTokenizer
from ...configuration_utils import PreTrainedConfig
from ...utils import logging
from ...onnx import OnnxConfig, OnnxConfigWithPast, OnnxSeq2SeqConfigWithPast
from ...onnx.utils import compute_effective_axis_dimension
from ...utils import is_torch_available, logging
logger = logging.get_logger(__name__)
@ -157,4 +164,224 @@ class MBartConfig(PreTrainedConfig):
)
__all__ = ["MBartConfig"]
# Copied from transformers.models.bart.configuration_bart.BartOnnxConfig with Bart->MBart
class MBartOnnxConfig(OnnxSeq2SeqConfigWithPast):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task in ["default", "seq2seq-lm"]:
common_inputs = OrderedDict(
[
("input_ids", {0: "batch", 1: "encoder_sequence"}),
("attention_mask", {0: "batch", 1: "encoder_sequence"}),
]
)
if self.use_past:
common_inputs["decoder_input_ids"] = {0: "batch"}
common_inputs["decoder_attention_mask"] = {0: "batch", 1: "past_decoder_sequence + sequence"}
else:
common_inputs["decoder_input_ids"] = {0: "batch", 1: "decoder_sequence"}
common_inputs["decoder_attention_mask"] = {0: "batch", 1: "decoder_sequence"}
if self.use_past:
self.fill_with_past_key_values_(common_inputs, direction="inputs")
elif self.task == "causal-lm":
# TODO: figure this case out.
common_inputs = OrderedDict(
[
("input_ids", {0: "batch", 1: "encoder_sequence"}),
("attention_mask", {0: "batch", 1: "encoder_sequence"}),
]
)
if self.use_past:
num_encoder_layers, _ = self.num_layers
for i in range(num_encoder_layers):
common_inputs[f"past_key_values.{i}.key"] = {0: "batch", 2: "past_sequence + sequence"}
common_inputs[f"past_key_values.{i}.value"] = {0: "batch", 2: "past_sequence + sequence"}
else:
common_inputs = OrderedDict(
[
("input_ids", {0: "batch", 1: "encoder_sequence"}),
("attention_mask", {0: "batch", 1: "encoder_sequence"}),
("decoder_input_ids", {0: "batch", 1: "decoder_sequence"}),
("decoder_attention_mask", {0: "batch", 1: "decoder_sequence"}),
]
)
return common_inputs
@property
def outputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task in ["default", "seq2seq-lm"]:
common_outputs = super().outputs
else:
common_outputs = super(OnnxConfigWithPast, self).outputs
if self.use_past:
num_encoder_layers, _ = self.num_layers
for i in range(num_encoder_layers):
common_outputs[f"present.{i}.key"] = {0: "batch", 2: "past_sequence + sequence"}
common_outputs[f"present.{i}.value"] = {0: "batch", 2: "past_sequence + sequence"}
return common_outputs
def _generate_dummy_inputs_for_default_and_seq2seq_lm(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
encoder_inputs = self._generate_dummy_inputs_for_sequence_classification_and_question_answering(
tokenizer, batch_size, seq_length, is_pair
)
# Generate decoder inputs
decoder_seq_length = seq_length if not self.use_past else 1
decoder_inputs = self._generate_dummy_inputs_for_sequence_classification_and_question_answering(
tokenizer, batch_size, decoder_seq_length, is_pair
)
decoder_inputs = {f"decoder_{name}": tensor for name, tensor in decoder_inputs.items()}
common_inputs = dict(**encoder_inputs, **decoder_inputs)
if self.use_past:
if not is_torch_available():
raise ValueError("Cannot generate dummy past_keys inputs without PyTorch installed.")
else:
import torch
batch, encoder_seq_length = common_inputs["input_ids"].shape
decoder_seq_length = common_inputs["decoder_input_ids"].shape[1]
num_encoder_attention_heads, num_decoder_attention_heads = self.num_attention_heads
encoder_shape = (
batch,
num_encoder_attention_heads,
encoder_seq_length,
self._config.hidden_size // num_encoder_attention_heads,
)
decoder_past_length = decoder_seq_length + 3
decoder_shape = (
batch,
num_decoder_attention_heads,
decoder_past_length,
self._config.hidden_size // num_decoder_attention_heads,
)
common_inputs["decoder_attention_mask"] = torch.cat(
[common_inputs["decoder_attention_mask"], torch.ones(batch, decoder_past_length)], dim=1
)
common_inputs["past_key_values"] = []
# If the numbers of encoder and decoder layers differ, fill the shared layers first and handle the remainder below
num_encoder_layers, num_decoder_layers = self.num_layers
min_num_layers = min(num_encoder_layers, num_decoder_layers)
max_num_layers = max(num_encoder_layers, num_decoder_layers) - min_num_layers
remaining_side_name = "encoder" if num_encoder_layers > num_decoder_layers else "decoder"
for _ in range(min_num_layers):
common_inputs["past_key_values"].append(
(
torch.zeros(decoder_shape),
torch.zeros(decoder_shape),
torch.zeros(encoder_shape),
torch.zeros(encoder_shape),
)
)
# TODO: test this.
shape = encoder_shape if remaining_side_name == "encoder" else decoder_shape
for _ in range(min_num_layers, max_num_layers):
common_inputs["past_key_values"].append((torch.zeros(shape), torch.zeros(shape)))
return common_inputs
def _generate_dummy_inputs_for_causal_lm(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
common_inputs = self._generate_dummy_inputs_for_sequence_classification_and_question_answering(
tokenizer, batch_size, seq_length, is_pair
)
if self.use_past:
if not is_torch_available():
raise ValueError("Cannot generate dummy past_keys inputs without PyTorch installed.")
else:
import torch
batch, seqlen = common_inputs["input_ids"].shape
# Not using the same length for past_key_values
past_key_values_length = seqlen + 2
num_encoder_layers, _ = self.num_layers
num_encoder_attention_heads, _ = self.num_attention_heads
past_shape = (
batch,
num_encoder_attention_heads,
past_key_values_length,
self._config.hidden_size // num_encoder_attention_heads,
)
mask_dtype = common_inputs["attention_mask"].dtype
common_inputs["attention_mask"] = torch.cat(
[common_inputs["attention_mask"], torch.ones(batch, past_key_values_length, dtype=mask_dtype)], dim=1
)
common_inputs["past_key_values"] = [
(torch.zeros(past_shape), torch.zeros(past_shape)) for _ in range(num_encoder_layers)
]
return common_inputs
def _generate_dummy_inputs_for_sequence_classification_and_question_answering(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
# Copied from OnnxConfig.generate_dummy_inputs
# Did not use super(OnnxConfigWithPast, self).generate_dummy_inputs for code clarity.
# If dynamic axis (-1) we forward with a fixed dimension of 2 samples to avoid optimizations made by ONNX
batch_size = compute_effective_axis_dimension(
batch_size, fixed_dimension=OnnxConfig.default_fixed_batch, num_token_to_add=0
)
# If dynamic axis (-1) we forward with a fixed dimension of 8 tokens to avoid optimizations made by ONNX
token_to_add = tokenizer.num_special_tokens_to_add(is_pair)
seq_length = compute_effective_axis_dimension(
seq_length, fixed_dimension=OnnxConfig.default_fixed_sequence, num_token_to_add=token_to_add
)
# Generate dummy inputs according to the computed batch and sequence
dummy_input = [" ".join([tokenizer.unk_token] * seq_length)] * batch_size
common_inputs = dict(tokenizer(dummy_input, return_tensors="pt"))
return common_inputs
def generate_dummy_inputs(
self,
tokenizer: PreTrainedTokenizer,
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
if self.task in ["default", "seq2seq-lm"]:
common_inputs = self._generate_dummy_inputs_for_default_and_seq2seq_lm(
tokenizer, batch_size=batch_size, seq_length=seq_length, is_pair=is_pair
)
elif self.task == "causal-lm":
common_inputs = self._generate_dummy_inputs_for_causal_lm(
tokenizer, batch_size=batch_size, seq_length=seq_length, is_pair=is_pair
)
else:
common_inputs = self._generate_dummy_inputs_for_sequence_classification_and_question_answering(
tokenizer, batch_size=batch_size, seq_length=seq_length, is_pair=is_pair
)
return common_inputs
def _flatten_past_key_values_(self, flattened_output, name, idx, t):
if self.task in ["default", "seq2seq-lm"]:
flattened_output = super()._flatten_past_key_values_(flattened_output, name, idx, t)
else:
flattened_output = super(OnnxSeq2SeqConfigWithPast, self)._flatten_past_key_values_(
flattened_output, name, idx, t
)
__all__ = ["MBartConfig", "MBartOnnxConfig"]

View File

@ -4,6 +4,7 @@
# the file from the modular. If any change should be done, please apply the change to the
# modular_metaclip_2.py file directly. One of our CI enforces this.
# 🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨
from ...configuration_utils import PreTrainedConfig
from ...utils import logging

View File

@ -14,7 +14,11 @@
# limitations under the License.
"""MobileBERT model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -160,4 +164,21 @@ class MobileBertConfig(PreTrainedConfig):
self.classifier_dropout = classifier_dropout
__all__ = ["MobileBertConfig"]
# Copied from transformers.models.bert.configuration_bert.BertOnnxConfig with Bert->MobileBert
class MobileBertOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
return OrderedDict(
[
("input_ids", dynamic_axis),
("attention_mask", dynamic_axis),
("token_type_ids", dynamic_axis),
]
)
__all__ = ["MobileBertConfig", "MobileBertOnnxConfig"]

View File

@ -14,7 +14,13 @@
# limitations under the License.
"""MobileNetV1 model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from packaging import version
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -98,4 +104,23 @@ class MobileNetV1Config(PreTrainedConfig):
self.layer_norm_eps = layer_norm_eps
__all__ = ["MobileNetV1Config"]
class MobileNetV1OnnxConfig(OnnxConfig):
torch_onnx_minimum_version = version.parse("1.11")
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict([("pixel_values", {0: "batch"})])
@property
def outputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "image-classification":
return OrderedDict([("logits", {0: "batch"})])
else:
return OrderedDict([("last_hidden_state", {0: "batch"}), ("pooler_output", {0: "batch"})])
@property
def atol_for_validation(self) -> float:
return 1e-4
__all__ = ["MobileNetV1Config", "MobileNetV1OnnxConfig"]

View File

@ -14,7 +14,13 @@
# limitations under the License.
"""MobileNetV2 model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from packaging import version
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -126,4 +132,23 @@ class MobileNetV2Config(PreTrainedConfig):
self.semantic_loss_ignore_index = semantic_loss_ignore_index
__all__ = ["MobileNetV2Config"]
class MobileNetV2OnnxConfig(OnnxConfig):
torch_onnx_minimum_version = version.parse("1.11")
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict([("pixel_values", {0: "batch"})])
@property
def outputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "image-classification":
return OrderedDict([("logits", {0: "batch"})])
else:
return OrderedDict([("last_hidden_state", {0: "batch"}), ("pooler_output", {0: "batch"})])
@property
def atol_for_validation(self) -> float:
return 1e-4
__all__ = ["MobileNetV2Config", "MobileNetV2OnnxConfig"]

View File

@ -14,7 +14,13 @@
# limitations under the License.
"""MobileViT model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from packaging import version
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -144,4 +150,23 @@ class MobileViTConfig(PreTrainedConfig):
self.semantic_loss_ignore_index = semantic_loss_ignore_index
__all__ = ["MobileViTConfig"]
class MobileViTOnnxConfig(OnnxConfig):
torch_onnx_minimum_version = version.parse("1.11")
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict([("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"})])
@property
def outputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "image-classification":
return OrderedDict([("logits", {0: "batch"})])
else:
return OrderedDict([("last_hidden_state", {0: "batch"}), ("pooler_output", {0: "batch"})])
@property
def atol_for_validation(self) -> float:
return 1e-4
__all__ = ["MobileViTConfig", "MobileViTOnnxConfig"]

View File

@ -14,7 +14,13 @@
# limitations under the License.
"""MobileViTV2 model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from packaging import version
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -140,4 +146,23 @@ class MobileViTV2Config(PreTrainedConfig):
self.semantic_loss_ignore_index = semantic_loss_ignore_index
__all__ = ["MobileViTV2Config"]
class MobileViTV2OnnxConfig(OnnxConfig):
torch_onnx_minimum_version = version.parse("1.11")
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict([("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"})])
@property
def outputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "image-classification":
return OrderedDict([("logits", {0: "batch"})])
else:
return OrderedDict([("last_hidden_state", {0: "batch"}), ("pooler_output", {0: "batch"})])
@property
def atol_for_validation(self) -> float:
return 1e-4
__all__ = ["MobileViTV2Config", "MobileViTV2OnnxConfig"]

View File

@ -14,7 +14,10 @@
# limitations under the License.
"""mT5 model configuration"""
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxSeq2SeqConfigWithPast
from ...utils import logging
@ -145,4 +148,35 @@ class MT5Config(PreTrainedConfig):
)
__all__ = ["MT5Config"]
class MT5OnnxConfig(OnnxSeq2SeqConfigWithPast):
@property
# Copied from transformers.models.t5.configuration_t5.T5OnnxConfig.inputs
def inputs(self) -> Mapping[str, Mapping[int, str]]:
common_inputs = {
"input_ids": {0: "batch", 1: "encoder_sequence"},
"attention_mask": {0: "batch", 1: "encoder_sequence"},
}
if self.use_past:
common_inputs["attention_mask"][1] = "past_encoder_sequence + sequence"
common_inputs["decoder_input_ids"] = {0: "batch"}
common_inputs["decoder_attention_mask"] = {0: "batch", 1: "past_decoder_sequence + sequence"}
else:
common_inputs["decoder_input_ids"] = {0: "batch", 1: "decoder_sequence"}
common_inputs["decoder_attention_mask"] = {0: "batch", 1: "decoder_sequence"}
if self.use_past:
self.fill_with_past_key_values_(common_inputs, direction="inputs")
return common_inputs
@property
# Copied from transformers.models.t5.configuration_t5.T5OnnxConfig.default_onnx_opset
def default_onnx_opset(self) -> int:
return 13
@property
def atol_for_validation(self) -> float:
return 5e-4
__all__ = ["MT5Config", "MT5OnnxConfig"]

View File

@ -14,7 +14,16 @@
# limitations under the License.
"""OWL-ViT model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from typing import TYPE_CHECKING, Any
if TYPE_CHECKING:
from ...processing_utils import ProcessorMixin
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -265,4 +274,52 @@ class OwlViTConfig(PreTrainedConfig):
super().__init__(**kwargs)
__all__ = ["OwlViTConfig", "OwlViTTextConfig", "OwlViTVisionConfig"]
class OwlViTOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("input_ids", {0: "batch", 1: "sequence"}),
("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
("attention_mask", {0: "batch", 1: "sequence"}),
]
)
@property
def outputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("logits_per_image", {0: "batch"}),
("logits_per_text", {0: "batch"}),
("text_embeds", {0: "batch"}),
("image_embeds", {0: "batch"}),
]
)
@property
def atol_for_validation(self) -> float:
return 1e-4
def generate_dummy_inputs(
self,
processor: "ProcessorMixin",
batch_size: int = -1,
seq_length: int = -1,
) -> Mapping[str, Any]:
text_input_dict = super().generate_dummy_inputs(
processor.tokenizer,
batch_size=batch_size,
seq_length=seq_length,
)
image_input_dict = super().generate_dummy_inputs(
processor.image_processor,
batch_size=batch_size,
)
return {**text_input_dict, **image_input_dict}
@property
def default_onnx_opset(self) -> int:
return 14
__all__ = ["OwlViTConfig", "OwlViTOnnxConfig", "OwlViTTextConfig", "OwlViTVisionConfig"]

View File

@ -14,7 +14,15 @@
# limitations under the License.
"""Perceiver model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from typing import Any, Union
from ...configuration_utils import PreTrainedConfig
from ...feature_extraction_utils import FeatureExtractionMixin
from ...onnx import OnnxConfig
from ...onnx.utils import compute_effective_axis_dimension
from ...tokenization_utils_base import PreTrainedTokenizerBase
from ...utils import logging
@ -174,4 +182,63 @@ class PerceiverConfig(PreTrainedConfig):
self._label_trainable_num_channels = _label_trainable_num_channels
__all__ = ["PerceiverConfig"]
class PerceiverOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
return OrderedDict(
[
("inputs", dynamic_axis),
("attention_mask", dynamic_axis),
]
)
@property
def atol_for_validation(self) -> float:
return 1e-4
def generate_dummy_inputs(
self,
preprocessor: Union["PreTrainedTokenizerBase", "FeatureExtractionMixin"],
batch_size: int = -1,
seq_length: int = -1,
num_choices: int = -1,
is_pair: bool = False,
num_channels: int = 3,
image_width: int = 40,
image_height: int = 40,
) -> Mapping[str, Any]:
# copied from `transformers.onnx.config.OnnxConfig` and slightly altered/simplified
if isinstance(preprocessor, PreTrainedTokenizerBase):
# If dynamic axis (-1) we forward with a fixed dimension of 2 samples to avoid optimizations made by ONNX
batch_size = compute_effective_axis_dimension(
batch_size, fixed_dimension=OnnxConfig.default_fixed_batch, num_token_to_add=0
)
# If dynamic axis (-1) we forward with a fixed dimension of 8 tokens to avoid optimizations made by ONNX
token_to_add = preprocessor.num_special_tokens_to_add(is_pair)
seq_length = compute_effective_axis_dimension(
seq_length, fixed_dimension=OnnxConfig.default_fixed_sequence, num_token_to_add=token_to_add
)
# Generate dummy inputs according to the computed batch and sequence
dummy_input = [" ".join(["a"] * seq_length)] * batch_size
inputs = dict(preprocessor(dummy_input, return_tensors="pt"))
inputs["inputs"] = inputs.pop("input_ids")
return inputs
elif isinstance(preprocessor, FeatureExtractionMixin) and preprocessor.model_input_names[0] == "pixel_values":
# If dynamic axis (-1) we forward with a fixed dimension of 2 samples to avoid optimizations made by ONNX
batch_size = compute_effective_axis_dimension(batch_size, fixed_dimension=OnnxConfig.default_fixed_batch)
dummy_input = self._generate_dummy_images(batch_size, num_channels, image_height, image_width)
inputs = dict(preprocessor(images=dummy_input, return_tensors="pt"))
inputs["inputs"] = inputs.pop("pixel_values")
return inputs
else:
raise ValueError(
"Unable to generate dummy inputs for the model. Please provide a tokenizer or a preprocessor."
)
__all__ = ["PerceiverConfig", "PerceiverOnnxConfig"]

View File

@ -14,7 +14,11 @@
# limitations under the License.
"""PLBART model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfigWithPast
from ...utils import logging
@ -161,4 +165,33 @@ class PLBartConfig(PreTrainedConfig):
)
class PLBartOnnxConfig(OnnxConfigWithPast):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("input_ids", {0: "batch", 1: "sequence"}),
("attention_mask", {0: "batch", 1: "sequence"}),
]
)
@property
def outputs(self) -> Mapping[str, Mapping[int, str]]:
if self.use_past:
return OrderedDict(
[
("last_hidden_state", {0: "batch", 1: "sequence"}),
("past_keys", {0: "batch", 2: "sequence"}),
("encoder_last_hidden_state", {0: "batch", 1: "sequence"}),
]
)
else:
return OrderedDict(
[
("last_hidden_state", {0: "batch", 1: "sequence"}),
("encoder_last_hidden_state", {0: "batch", 1: "sequence"}),
]
)
__all__ = ["PLBartConfig"]

View File

@ -14,7 +14,13 @@
# limitations under the License.
"""PoolFormer model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from packaging import version
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -123,4 +129,20 @@ class PoolFormerConfig(PreTrainedConfig):
super().__init__(**kwargs)
__all__ = ["PoolFormerConfig"]
class PoolFormerOnnxConfig(OnnxConfig):
torch_onnx_minimum_version = version.parse("1.11")
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
]
)
@property
def atol_for_validation(self) -> float:
return 2e-3
__all__ = ["PoolFormerConfig", "PoolFormerOnnxConfig"]

View File

@ -16,9 +16,13 @@
# limitations under the License.
"""Pvt model configuration"""
from collections import OrderedDict
from collections.abc import Callable, Mapping
from packaging import version
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -135,4 +139,24 @@ class PvtConfig(PreTrainedConfig):
self.qkv_bias = qkv_bias
__all__ = ["PvtConfig"]
class PvtOnnxConfig(OnnxConfig):
torch_onnx_minimum_version = version.parse("1.11")
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
]
)
@property
def atol_for_validation(self) -> float:
return 1e-4
@property
def default_onnx_opset(self) -> int:
return 12
__all__ = ["PvtConfig", "PvtOnnxConfig"]

View File

@ -14,7 +14,11 @@
# limitations under the License.
"""RemBERT model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -135,4 +139,24 @@ class RemBertConfig(PreTrainedConfig):
self.tie_word_embeddings = False
__all__ = ["RemBertConfig"]
class RemBertOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
return OrderedDict(
[
("input_ids", dynamic_axis),
("attention_mask", dynamic_axis),
("token_type_ids", dynamic_axis),
]
)
@property
def atol_for_validation(self) -> float:
return 1e-4
__all__ = ["RemBertConfig", "RemBertOnnxConfig"]

View File

@ -14,7 +14,13 @@
# limitations under the License.
"""ResNet model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from packaging import version
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
from ...utils.backbone_utils import BackboneConfigMixin, get_aligned_output_features_output_indices
@ -111,4 +117,20 @@ class ResNetConfig(BackboneConfigMixin, PreTrainedConfig):
)
__all__ = ["ResNetConfig"]
class ResNetOnnxConfig(OnnxConfig):
torch_onnx_minimum_version = version.parse("1.11")
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
]
)
@property
def atol_for_validation(self) -> float:
return 1e-3
__all__ = ["ResNetConfig", "ResNetOnnxConfig"]

View File

@ -15,7 +15,11 @@
# limitations under the License.
"""RoBERTa configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -125,4 +129,19 @@ class RobertaConfig(PreTrainedConfig):
self.classifier_dropout = classifier_dropout
__all__ = ["RobertaConfig"]
class RobertaOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
return OrderedDict(
[
("input_ids", dynamic_axis),
("attention_mask", dynamic_axis),
]
)
__all__ = ["RobertaConfig", "RobertaOnnxConfig"]

View File

@ -15,7 +15,11 @@
# limitations under the License.
"""RoBERTa-PreLayerNorm configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -126,4 +130,20 @@ class RobertaPreLayerNormConfig(PreTrainedConfig):
self.classifier_dropout = classifier_dropout
__all__ = ["RobertaPreLayerNormConfig"]
# Copied from transformers.models.roberta.configuration_roberta.RobertaOnnxConfig with Roberta->RobertaPreLayerNorm
class RobertaPreLayerNormOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
return OrderedDict(
[
("input_ids", dynamic_axis),
("attention_mask", dynamic_axis),
]
)
__all__ = ["RobertaPreLayerNormConfig", "RobertaPreLayerNormOnnxConfig"]

View File

@ -14,7 +14,11 @@
# limitations under the License.
"""RoFormer model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -126,4 +130,21 @@ class RoFormerConfig(PreTrainedConfig):
self.use_cache = use_cache
__all__ = ["RoFormerConfig"]
class RoFormerOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
dynamic_axis = {0: "batch", 1: "sequence"}
return OrderedDict(
[
("input_ids", dynamic_axis),
("attention_mask", dynamic_axis),
("token_type_ids", dynamic_axis),
]
)
__all__ = ["RoFormerConfig", "RoFormerOnnxConfig"]

View File

@ -15,8 +15,13 @@
"""SegFormer model configuration"""
import warnings
from collections import OrderedDict
from collections.abc import Mapping
from packaging import version
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -143,4 +148,24 @@ class SegformerConfig(PreTrainedConfig):
self.semantic_loss_ignore_index = semantic_loss_ignore_index
__all__ = ["SegformerConfig"]
class SegformerOnnxConfig(OnnxConfig):
torch_onnx_minimum_version = version.parse("1.11")
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
]
)
@property
def atol_for_validation(self) -> float:
return 1e-4
@property
def default_onnx_opset(self) -> int:
return 12
__all__ = ["SegformerConfig", "SegformerOnnxConfig"]

View File

@ -14,7 +14,11 @@
# limitations under the License.
"""SqueezeBERT model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -143,4 +147,21 @@ class SqueezeBertConfig(PreTrainedConfig):
self.output_groups = output_groups
__all__ = ["SqueezeBertConfig"]
# Copied from transformers.models.bert.configuration_bert.BertOnnxConfig with Bert->SqueezeBert
class SqueezeBertOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
return OrderedDict(
[
("input_ids", dynamic_axis),
("attention_mask", dynamic_axis),
("token_type_ids", dynamic_axis),
]
)
__all__ = ["SqueezeBertConfig", "SqueezeBertOnnxConfig"]

View File

@ -14,7 +14,13 @@
# limitations under the License.
"""SwiftFormer model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from packaging import version
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -123,4 +129,20 @@ class SwiftFormerConfig(PreTrainedConfig):
self.batch_norm_eps = batch_norm_eps
__all__ = ["SwiftFormerConfig"]
class SwiftFormerOnnxConfig(OnnxConfig):
torch_onnx_minimum_version = version.parse("1.11")
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
]
)
@property
def atol_for_validation(self) -> float:
return 1e-4
__all__ = ["SwiftFormerConfig", "SwiftFormerOnnxConfig"]

View File

@ -14,7 +14,13 @@
# limitations under the License.
"""Swin Transformer model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from packaging import version
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
from ...utils.backbone_utils import BackboneConfigMixin, get_aligned_output_features_output_indices
@ -154,4 +160,20 @@ class SwinConfig(BackboneConfigMixin, PreTrainedConfig):
)
__all__ = ["SwinConfig"]
class SwinOnnxConfig(OnnxConfig):
torch_onnx_minimum_version = version.parse("1.11")
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
]
)
@property
def atol_for_validation(self) -> float:
return 1e-4
__all__ = ["SwinConfig", "SwinOnnxConfig"]

View File

@ -14,7 +14,10 @@
# limitations under the License.
"""T5 model configuration"""
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxSeq2SeqConfigWithPast
from ...utils import logging
@ -140,4 +143,29 @@ class T5Config(PreTrainedConfig):
)
__all__ = ["T5Config"]
class T5OnnxConfig(OnnxSeq2SeqConfigWithPast):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
common_inputs = {
"input_ids": {0: "batch", 1: "encoder_sequence"},
"attention_mask": {0: "batch", 1: "encoder_sequence"},
}
if self.use_past:
common_inputs["attention_mask"][1] = "past_encoder_sequence + sequence"
common_inputs["decoder_input_ids"] = {0: "batch"}
common_inputs["decoder_attention_mask"] = {0: "batch", 1: "past_decoder_sequence + sequence"}
else:
common_inputs["decoder_input_ids"] = {0: "batch", 1: "decoder_sequence"}
common_inputs["decoder_attention_mask"] = {0: "batch", 1: "decoder_sequence"}
if self.use_past:
self.fill_with_past_key_values_(common_inputs, direction="inputs")
return common_inputs
@property
def default_onnx_opset(self) -> int:
return 13
__all__ = ["T5Config", "T5OnnxConfig"]

View File

@ -14,7 +14,13 @@
# limitations under the License.
"""Table Transformer model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from packaging import version
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
from ...utils.backbone_utils import verify_backbone_config_arguments
from ..auto import CONFIG_MAPPING, AutoConfig
@ -241,4 +247,26 @@ class TableTransformerConfig(PreTrainedConfig):
super().__init__(is_encoder_decoder=is_encoder_decoder, **kwargs)
__all__ = ["TableTransformerConfig"]
# Copied from transformers.models.detr.configuration_detr.DetrOnnxConfig
class TableTransformerOnnxConfig(OnnxConfig):
torch_onnx_minimum_version = version.parse("1.11")
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
("pixel_mask", {0: "batch"}),
]
)
@property
def atol_for_validation(self) -> float:
return 1e-5
@property
def default_onnx_opset(self) -> int:
return 12
__all__ = ["TableTransformerConfig", "TableTransformerOnnxConfig"]

View File

@ -14,7 +14,10 @@
# limitations under the License.
"""UMT5 model configuration"""
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxSeq2SeqConfigWithPast
from ...utils import logging
@ -144,4 +147,35 @@ class UMT5Config(PreTrainedConfig):
)
__all__ = ["UMT5Config"]
class UMT5OnnxConfig(OnnxSeq2SeqConfigWithPast):
@property
# Copied from transformers.models.t5.configuration_t5.T5OnnxConfig.inputs
def inputs(self) -> Mapping[str, Mapping[int, str]]:
common_inputs = {
"input_ids": {0: "batch", 1: "encoder_sequence"},
"attention_mask": {0: "batch", 1: "encoder_sequence"},
}
if self.use_past:
common_inputs["attention_mask"][1] = "past_encoder_sequence + sequence"
common_inputs["decoder_input_ids"] = {0: "batch"}
common_inputs["decoder_attention_mask"] = {0: "batch", 1: "past_decoder_sequence + sequence"}
else:
common_inputs["decoder_input_ids"] = {0: "batch", 1: "decoder_sequence"}
common_inputs["decoder_attention_mask"] = {0: "batch", 1: "decoder_sequence"}
if self.use_past:
self.fill_with_past_key_values_(common_inputs, direction="inputs")
return common_inputs
@property
# Copied from transformers.models.t5.configuration_t5.T5OnnxConfig.default_onnx_opset
def default_onnx_opset(self) -> int:
return 13
@property
def atol_for_validation(self) -> float:
return 5e-4
__all__ = ["UMT5Config", "UMT5OnnxConfig"]

View File

@ -14,11 +14,21 @@
# See the License for the specific language governing permissions and
# limitations under the License.
from collections import OrderedDict
from collections.abc import Mapping
from typing import TYPE_CHECKING, Any
from packaging import version
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
from ..auto.configuration_auto import AutoConfig
if TYPE_CHECKING:
from ... import PreTrainedTokenizerBase
logger = logging.get_logger(__name__)
@ -108,4 +118,100 @@ class VisionEncoderDecoderConfig(PreTrainedConfig):
return cls(encoder=encoder_config.to_dict(), decoder=decoder_config.to_dict(), **kwargs)
__all__ = ["VisionEncoderDecoderConfig"]
class VisionEncoderDecoderEncoderOnnxConfig(OnnxConfig):
torch_onnx_minimum_version = version.parse("1.11")
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
]
)
@property
def atol_for_validation(self) -> float:
return 1e-4
@property
def outputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict({"last_hidden_state": {0: "batch", 1: "encoder_sequence"}})
class VisionEncoderDecoderDecoderOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
common_inputs = OrderedDict()
common_inputs["input_ids"] = {0: "batch", 1: "past_decoder_sequence + sequence"}
common_inputs["attention_mask"] = {0: "batch", 1: "past_decoder_sequence + sequence"}
common_inputs["encoder_hidden_states"] = {0: "batch", 1: "encoder_sequence"}
return common_inputs
def generate_dummy_inputs(
self,
tokenizer: "PreTrainedTokenizerBase",
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
import torch
common_inputs = OrderedDict()
dummy_input = super().generate_dummy_inputs(
tokenizer,
batch_size=batch_size,
seq_length=seq_length,
is_pair=is_pair,
)
batch, encoder_sequence = dummy_input["input_ids"].shape
encoder_hidden_states_shape = (batch, encoder_sequence, self._config.encoder_hidden_size)
common_inputs["input_ids"] = dummy_input.pop("input_ids")
common_inputs["attention_mask"] = dummy_input.pop("attention_mask")
common_inputs["encoder_hidden_states"] = torch.zeros(encoder_hidden_states_shape)
return common_inputs
class VisionEncoderDecoderOnnxConfig(OnnxConfig):
@property
def inputs(self) -> None:
pass
def get_encoder_config(self, encoder_config: PreTrainedConfig) -> OnnxConfig:
r"""
Returns ONNX encoder config for `VisionEncoderDecoder` model.
Args:
encoder_config (`PreTrainedConfig`):
The encoder model's configuration to use when exporting to ONNX.
Returns:
[`VisionEncoderDecoderEncoderOnnxConfig`]: An instance of the ONNX configuration object
"""
return VisionEncoderDecoderEncoderOnnxConfig(encoder_config)
def get_decoder_config(
self, encoder_config: PreTrainedConfig, decoder_config: PreTrainedConfig, feature: str = "default"
) -> OnnxConfig:
r"""
Returns ONNX decoder config for `VisionEncoderDecoder` model.
Args:
encoder_config (`PreTrainedConfig`):
The encoder model's configuration to use when exporting to ONNX.
decoder_config (`PreTrainedConfig`):
The decoder model's configuration to use when exporting to ONNX
feature (`str`, *optional*):
The type of feature to export the model with.
Returns:
[`VisionEncoderDecoderDecoderOnnxConfig`]: An instance of the ONNX configuration object.
"""
decoder_config.encoder_hidden_size = encoder_config.hidden_size
return VisionEncoderDecoderDecoderOnnxConfig(decoder_config, feature)
__all__ = ["VisionEncoderDecoderConfig", "VisionEncoderDecoderOnnxConfig"]

View File

@ -14,7 +14,13 @@
# limitations under the License.
"""ViT model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from packaging import version
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -124,4 +130,20 @@ class ViTConfig(PreTrainedConfig):
self.pooler_act = pooler_act
__all__ = ["ViTConfig"]
class ViTOnnxConfig(OnnxConfig):
torch_onnx_minimum_version = version.parse("1.11")
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
]
)
@property
def atol_for_validation(self) -> float:
return 1e-4
__all__ = ["ViTConfig", "ViTOnnxConfig"]

View File

@ -14,10 +14,19 @@
# limitations under the License.
"""Whisper model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from typing import TYPE_CHECKING, Any, Union
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig, OnnxSeq2SeqConfigWithPast
from ...utils import logging
if TYPE_CHECKING:
from ...feature_extraction_utils import FeatureExtractionMixin
from ...tokenization_utils_base import PreTrainedTokenizerBase
logger = logging.get_logger(__name__)
@ -276,4 +285,64 @@ class WhisperConfig(PreTrainedConfig):
)
__all__ = ["WhisperConfig"]
class WhisperOnnxConfig(OnnxSeq2SeqConfigWithPast):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
common_inputs = OrderedDict(
[
("input_features", {0: "batch", 1: "feature_size", 2: "encoder_sequence"}),
]
)
if self.use_past:
common_inputs["decoder_input_ids"] = {0: "batch"}
else:
common_inputs["decoder_input_ids"] = {0: "batch", 1: "decoder_sequence"}
if self.use_past:
self.fill_with_past_key_values_(common_inputs, direction="inputs")
return common_inputs
def generate_dummy_inputs(
self,
preprocessor: Union["PreTrainedTokenizerBase", "FeatureExtractionMixin"],
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
sampling_rate: int = 22050,
time_duration: float = 5.0,
frequency: int = 220,
) -> Mapping[str, Any]:
dummy_inputs = OrderedDict()
encoder_inputs = OnnxConfig.generate_dummy_inputs(
self,
preprocessor=preprocessor.feature_extractor,
batch_size=batch_size,
sampling_rate=sampling_rate,
time_duration=time_duration,
frequency=frequency,
)
encoder_sequence_length = encoder_inputs["input_features"].shape[2]
seq_length = encoder_sequence_length // 2 if self.use_past else seq_length
decoder_inputs = super().generate_dummy_inputs(
preprocessor.tokenizer,
batch_size,
seq_length,
is_pair,
)
dummy_inputs["input_features"] = encoder_inputs.pop("input_features")
dummy_inputs["decoder_input_ids"] = decoder_inputs.pop("decoder_input_ids")
if "past_key_values" in decoder_inputs:
dummy_inputs["past_key_values"] = decoder_inputs.pop("past_key_values")
return dummy_inputs
@property
def atol_for_validation(self) -> float:
return 1e-3
__all__ = ["WhisperConfig", "WhisperOnnxConfig"]

View File

@ -14,7 +14,11 @@
# limitations under the License.
"""XLM configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -217,4 +221,21 @@ class XLMConfig(PreTrainedConfig):
super().__init__(pad_token_id=pad_token_id, bos_token_id=bos_token_id, **kwargs)
__all__ = ["XLMConfig"]
# Copied from transformers.models.bert.configuration_bert.BertOnnxConfig
class XLMOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
return OrderedDict(
[
("input_ids", dynamic_axis),
("attention_mask", dynamic_axis),
("token_type_ids", dynamic_axis),
]
)
__all__ = ["XLMConfig", "XLMOnnxConfig"]

View File

@ -15,7 +15,11 @@
# limitations under the License.
"""XLM-RoBERTa configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -126,4 +130,20 @@ class XLMRobertaConfig(PreTrainedConfig):
self.classifier_dropout = classifier_dropout
__all__ = ["XLMRobertaConfig"]
# Copied from transformers.models.roberta.configuration_roberta.RobertaOnnxConfig with Roberta->XLMRoberta
class XLMRobertaOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
return OrderedDict(
[
("input_ids", dynamic_axis),
("attention_mask", dynamic_axis),
]
)
__all__ = ["XLMRobertaConfig", "XLMRobertaOnnxConfig"]

View File

@ -14,7 +14,11 @@
# limitations under the License.
"""XLM_ROBERTa_XL configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -122,4 +126,20 @@ class XLMRobertaXLConfig(PreTrainedConfig):
self.classifier_dropout = classifier_dropout
__all__ = ["XLMRobertaXLConfig"]
# Copied from transformers.models.roberta.configuration_roberta.RobertaOnnxConfig with Roberta->XLMRobertaXL
class XLMRobertaXLOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
return OrderedDict(
[
("input_ids", dynamic_axis),
("attention_mask", dynamic_axis),
]
)
__all__ = ["XLMRobertaXLConfig", "XLMRobertaXLOnnxConfig"]

View File

@ -15,7 +15,11 @@
# limitations under the License.
"""X-MOD configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -154,4 +158,20 @@ class XmodConfig(PreTrainedConfig):
self.default_language = default_language
__all__ = ["XmodConfig"]
# Copied from transformers.models.roberta.configuration_roberta.RobertaOnnxConfig with Roberta->Xmod
class XmodOnnxConfig(OnnxConfig):
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
if self.task == "multiple-choice":
dynamic_axis = {0: "batch", 1: "choice", 2: "sequence"}
else:
dynamic_axis = {0: "batch", 1: "sequence"}
return OrderedDict(
[
("input_ids", dynamic_axis),
("attention_mask", dynamic_axis),
]
)
__all__ = ["XmodConfig", "XmodOnnxConfig"]

View File

@ -14,7 +14,13 @@
# limitations under the License.
"""YOLOS model configuration"""
from collections import OrderedDict
from collections.abc import Mapping
from packaging import version
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging
@ -149,4 +155,24 @@ class YolosConfig(PreTrainedConfig):
self.eos_coefficient = eos_coefficient
__all__ = ["YolosConfig"]
class YolosOnnxConfig(OnnxConfig):
torch_onnx_minimum_version = version.parse("1.11")
@property
def inputs(self) -> Mapping[str, Mapping[int, str]]:
return OrderedDict(
[
("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
]
)
@property
def atol_for_validation(self) -> float:
return 1e-4
@property
def default_onnx_opset(self) -> int:
return 12
__all__ = ["YolosConfig", "YolosOnnxConfig"]

View File

@ -0,0 +1,45 @@
# Copyright 2020 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from typing import TYPE_CHECKING
from ..utils import _LazyModule
_import_structure = {
"config": [
"EXTERNAL_DATA_FORMAT_SIZE_LIMIT",
"OnnxConfig",
"OnnxConfigWithPast",
"OnnxSeq2SeqConfigWithPast",
"PatchingSpec",
],
"utils": ["ParameterFormat", "compute_serialized_parameters_size"],
}
if TYPE_CHECKING:
from .config import (
EXTERNAL_DATA_FORMAT_SIZE_LIMIT,
OnnxConfig,
OnnxConfigWithPast,
OnnxSeq2SeqConfigWithPast,
PatchingSpec,
)
from .utils import ParameterFormat, compute_serialized_parameters_size
else:
import sys
sys.modules[__name__] = _LazyModule(__name__, globals()["__file__"], _import_structure, module_spec=__spec__)
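With the `_LazyModule` indirection, importing the package is cheap and the submodules above only load on first attribute access; a quick check:

```python
from transformers.onnx import OnnxConfig  # triggers the lazy load of .config

print(OnnxConfig.default_fixed_batch, OnnxConfig.default_fixed_sequence)  # 2 8
```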

View File

@ -0,0 +1,748 @@
# Copyright 2021 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import copy
import dataclasses
import warnings
from abc import ABC, abstractmethod
from collections import OrderedDict
from collections.abc import Callable, Iterable, Mapping
from typing import TYPE_CHECKING, Any, Optional, Union
import numpy as np
from packaging import version
from ..utils import is_torch_available, is_vision_available, logging
from .utils import ParameterFormat, compute_effective_axis_dimension, compute_serialized_parameters_size
if TYPE_CHECKING:
from ..configuration_utils import PreTrainedConfig
from ..feature_extraction_utils import FeatureExtractionMixin
from ..image_processing_utils import ImageProcessingMixin
from ..tokenization_utils_base import PreTrainedTokenizerBase
if is_vision_available():
from PIL import Image
logger = logging.get_logger(__name__)
DEFAULT_ONNX_OPSET = 11
# 2 GiB
EXTERNAL_DATA_FORMAT_SIZE_LIMIT = 2 * 1024 * 1024 * 1024
@dataclasses.dataclass
class PatchingSpec:
"""
Data class that holds patching specifications.
Args:
o: Module / object where the op to patch is located
name: Name of the op to monkey patch
custom_op: Custom op that patches the original op
orig_op: Original op that is being patched
op_wrapper: Wrapper (optional) that wraps both the original and custom ops.
It is useful for ops that are class or static methods for instance.
"""
o: Any
name: str
custom_op: Callable
orig_op: Optional[Callable] = None
op_wrapper: Optional[Callable] = None
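# Illustrative use (not part of this module): patch an op for the duration of an export, then restore it.
#   spec = PatchingSpec(o=math, name="cos", custom_op=lambda x: 1.0)
#   spec = dataclasses.replace(spec, orig_op=getattr(spec.o, spec.name))
#   setattr(spec.o, spec.name, spec.custom_op)  # apply the patch
#   setattr(spec.o, spec.name, spec.orig_op)    # undo the patch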
class OnnxConfig(ABC):
"""
Base class for ONNX exportable model describing metadata on how to export the model through the ONNX format.
"""
default_fixed_batch = 2
default_fixed_sequence = 8
default_fixed_num_choices = 4
torch_onnx_minimum_version = version.parse("1.8")
_tasks_to_common_outputs = {
"causal-lm": OrderedDict({"logits": {0: "batch", 1: "sequence"}}),
"default": OrderedDict({"last_hidden_state": {0: "batch", 1: "sequence"}}),
"image-classification": OrderedDict({"logits": {0: "batch", 1: "sequence"}}),
"image-segmentation": OrderedDict(
{
"logits": {0: "batch", 1: "sequence"},
"pred_boxes": {0: "batch", 1: "sequence"},
"pred_masks": {0: "batch", 1: "sequence"},
}
),
"masked-im": OrderedDict({"logits": {0: "batch", 1: "sequence"}}),
"masked-lm": OrderedDict({"logits": {0: "batch", 1: "sequence"}}),
"multiple-choice": OrderedDict({"logits": {0: "batch"}}),
"object-detection": OrderedDict(
{
"logits": {0: "batch", 1: "sequence"},
"pred_boxes": {0: "batch", 1: "sequence"},
}
),
"question-answering": OrderedDict(
{
"start_logits": {0: "batch", 1: "sequence"},
"end_logits": {0: "batch", 1: "sequence"},
}
),
"semantic-segmentation": OrderedDict({"logits": {0: "batch", 1: "num_labels", 2: "height", 3: "width"}}),
"seq2seq-lm": OrderedDict({"logits": {0: "batch", 1: "decoder_sequence"}}),
"sequence-classification": OrderedDict({"logits": {0: "batch"}}),
"token-classification": OrderedDict({"logits": {0: "batch", 1: "sequence"}}),
"vision2seq-lm": OrderedDict({"logits": {0: "batch", 1: "sequence"}}),
"speech2seq-lm": OrderedDict({"logits": {0: "batch", 1: "sequence"}}),
}
def __init__(
self, config: "PreTrainedConfig", task: str = "default", patching_specs: Optional[list[PatchingSpec]] = None
):
self._config = config
if task not in self._tasks_to_common_outputs:
raise ValueError(
f"{task} is not a supported task, supported tasks: {self._tasks_to_common_outputs.keys()}"
)
self.task = task
self._patching_specs = []
for spec in patching_specs if patching_specs is not None else []:
final_spec = spec
if spec.orig_op is None:
final_spec = dataclasses.replace(spec, orig_op=getattr(spec.o, spec.name))
self._patching_specs.append(final_spec)
@classmethod
def from_model_config(cls, config: "PreTrainedConfig", task: str = "default") -> "OnnxConfig":
"""
Instantiate an OnnxConfig for a specific model
Args:
config: The model's configuration to use when exporting to ONNX
Returns:
OnnxConfig for this model
"""
return cls(config, task=task)
@property
@abstractmethod
def inputs(self) -> Mapping[str, Mapping[int, str]]:
"""
Mapping containing the axis definition of the input tensors to provide to the model
Returns:
For each input: its name, mapped to the axis positions within the tensor and their symbolic names
"""
raise NotImplementedError()
@property
def outputs(self) -> Mapping[str, Mapping[int, str]]:
"""
Mapping containing the axis definition of the output tensors to provide to the model
Returns:
For each output: its name mapped to the axis position within the tensor and the axis's symbolic name
"""
common_outputs = self._tasks_to_common_outputs[self.task]
return copy.deepcopy(common_outputs)
@property
def values_override(self) -> Optional[Mapping[str, Any]]:
"""
Dictionary of keys to override in the model's config before exporting
Returns:
Dictionary with the keys (and their corresponding values) to override
"""
if hasattr(self._config, "use_cache"):
return {"use_cache": False}
return None
@property
def default_batch_size(self) -> int:
"""
The default batch size to use if no other indication is given
Returns:
Integer > 0
"""
# Using 2 avoids ONNX making assumptions about single-sample batches
return OnnxConfig.default_fixed_batch
@property
def default_sequence_length(self) -> int:
"""
The default sequence length to use if no other indication is given
Returns:
Integer > 0
"""
return OnnxConfig.default_fixed_sequence
@property
def default_num_choices(self) -> int:
"""
The default number of choices to use if no other indication is given
Returns:
Integer > 0
"""
return OnnxConfig.default_fixed_num_choices
@property
def default_onnx_opset(self) -> int:
"""
Which ONNX opset to use when exporting the model
Returns:
Integer ONNX Opset version
"""
return DEFAULT_ONNX_OPSET
@property
def atol_for_validation(self) -> float:
"""
What absolute tolerance value to use during model conversion validation.
Returns:
Float absolute tolerance value.
"""
return 1e-5
@property
def is_torch_support_available(self) -> bool:
"""
Whether the installed PyTorch version meets the minimum required to export the model.
Returns:
`bool`: Whether the installed version of PyTorch is compatible with the model.
"""
if is_torch_available():
from transformers.utils import get_torch_version
return version.parse(get_torch_version()) >= self.torch_onnx_minimum_version
else:
return False
@staticmethod
def use_external_data_format(num_parameters: int) -> bool:
"""
Flag indicating if the model requires using the external data format
Args:
num_parameters: Number of parameters in the model
Returns:
True if model.num_parameters() * size_of(float32) >= 2 GB, False otherwise
"""
return (
compute_serialized_parameters_size(num_parameters, ParameterFormat.Float)
>= EXTERNAL_DATA_FORMAT_SIZE_LIMIT
)
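# Quick sanity check (illustrative): a hypothetical 700M-parameter float32 model
# serializes to about 700e6 * 4 bytes ≈ 2.8 GB, above the 2 GB protobuf limit:
#   OnnxConfig.use_external_data_format(700_000_000)  # -> True
#   OnnxConfig.use_external_data_format(100_000_000)  # -> False (~0.4 GB)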
def _generate_dummy_images(
self, batch_size: int = 2, num_channels: int = 3, image_height: int = 40, image_width: int = 40
):
images = []
for _ in range(batch_size):
data = np.random.rand(image_height, image_width, num_channels) * 255
images.append(Image.fromarray(data.astype("uint8")).convert("RGB"))
return images
def _generate_dummy_audio(
self, batch_size: int = 2, sampling_rate: int = 22050, time_duration: float = 5.0, frequency: int = 220
):
audio_data = []
for _ in range(batch_size):
# time variable
t = np.linspace(0, time_duration, int(time_duration * sampling_rate), endpoint=False)
# generate pure sine wave at `frequency` Hz
audio_data.append(0.5 * np.sin(2 * np.pi * frequency * t))
return audio_data
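# With the defaults above (illustrative): 5.0 s at 22050 Hz gives each batch item
# int(5.0 * 22050) = 110250 samples of a 220 Hz sine wave with amplitude 0.5.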
def generate_dummy_inputs(
self,
preprocessor: Union["PreTrainedTokenizerBase", "FeatureExtractionMixin", "ImageProcessingMixin"],
batch_size: int = -1,
seq_length: int = -1,
num_choices: int = -1,
is_pair: bool = False,
num_channels: int = 3,
image_width: int = 40,
image_height: int = 40,
sampling_rate: int = 22050,
time_duration: float = 5.0,
frequency: int = 220,
tokenizer: Optional["PreTrainedTokenizerBase"] = None,
) -> Mapping[str, Any]:
"""
Generate inputs to provide to the ONNX exporter
Args:
preprocessor: ([`PreTrainedTokenizerBase`], [`FeatureExtractionMixin`], or [`ImageProcessingMixin`]):
The preprocessor associated with this model configuration.
batch_size (`int`, *optional*, defaults to -1):
The batch size to export the model for (-1 means dynamic axis).
num_choices (`int`, *optional*, defaults to -1):
The number of candidate answers provided for multiple choice task (-1 means dynamic axis).
seq_length (`int`, *optional*, defaults to -1):
The sequence length to export the model for (-1 means dynamic axis).
is_pair (`bool`, *optional*, defaults to `False`):
Indicate if the input is a pair (sentence 1, sentence 2)
num_channels (`int`, *optional*, defaults to 3):
The number of channels of the generated images.
image_width (`int`, *optional*, defaults to 40):
The width of the generated images.
image_height (`int`, *optional*, defaults to 40):
The height of the generated images.
sampling_rate (`int`, *optional*, defaults to 22050):
The sampling rate for audio data generation.
time_duration (`float`, *optional*, defaults to 5.0):
Total seconds of sampling for audio data generation.
frequency (`int`, *optional*, defaults to 220):
The desired natural frequency of generated audio.
Returns:
Mapping[str, Tensor] holding the kwargs to provide to the model's forward function
"""
from ..feature_extraction_utils import FeatureExtractionMixin
from ..image_processing_utils import ImageProcessingMixin
from ..tokenization_utils_base import PreTrainedTokenizerBase
if isinstance(preprocessor, PreTrainedTokenizerBase) and tokenizer is not None:
raise ValueError("You cannot provide both a tokenizer and a preprocessor to generate dummy inputs.")
if tokenizer is not None:
warnings.warn(
"The `tokenizer` argument is deprecated and will be removed in version 5 of Transformers. Use"
" `preprocessor` instead.",
FutureWarning,
)
logger.warning("Overwriting the `preprocessor` argument with `tokenizer` to generate dummy inputs.")
preprocessor = tokenizer
if isinstance(preprocessor, PreTrainedTokenizerBase):
# If dynamic axis (-1) we forward with a fixed dimension of 2 samples to avoid optimizations made by ONNX
batch_size = compute_effective_axis_dimension(
batch_size, fixed_dimension=OnnxConfig.default_fixed_batch, num_token_to_add=0
)
# If dynamic axis (-1) we forward with a fixed dimension of 8 tokens to avoid optimizations made by ONNX
token_to_add = preprocessor.num_special_tokens_to_add(is_pair)
seq_length = compute_effective_axis_dimension(
seq_length, fixed_dimension=OnnxConfig.default_fixed_sequence, num_token_to_add=token_to_add
)
# Generate dummy inputs according to compute batch and sequence
input_token = (
preprocessor.unk_token
if (preprocessor.unk_token is not None and len(preprocessor.unk_token) > 0)
else "0"
)
dummy_input = [" ".join([input_token] * seq_length)] * batch_size  # seq_length space-separated tokens per sample
if self.task == "multiple-choice":
# If dynamic axis (-1) we forward with a fixed dimension of 4 candidate answers to avoid optimizations
# made by ONNX
num_choices = compute_effective_axis_dimension(
num_choices, fixed_dimension=OnnxConfig.default_fixed_num_choices, num_token_to_add=0
)
dummy_input = dummy_input * num_choices
# The shape of the tokenized inputs values is [batch_size * num_choices, seq_length]
tokenized_input = preprocessor(dummy_input, text_pair=dummy_input)
# Unflatten the tokenized inputs values expanding it to the shape [batch_size, num_choices, seq_length]
for k, v in tokenized_input.items():
tokenized_input[k] = [v[i : i + num_choices] for i in range(0, len(v), num_choices)]
return dict(tokenized_input.convert_to_tensors(tensor_type="pt"))
return dict(preprocessor(dummy_input, return_tensors="pt"))
elif isinstance(preprocessor, ImageProcessingMixin):
if preprocessor.model_input_names[0] != "pixel_values":
raise ValueError(
f"The `preprocessor` is an image processor ({preprocessor.__class__.__name__}) and expects"
f' `model_input_names[0]` to be "pixel_values", but got {preprocessor.model_input_names[0]}'
)
# If dynamic axis (-1) we forward with a fixed dimension of 2 samples to avoid optimizations made by ONNX
batch_size = compute_effective_axis_dimension(batch_size, fixed_dimension=OnnxConfig.default_fixed_batch)
dummy_input = self._generate_dummy_images(batch_size, num_channels, image_height, image_width)
return dict(preprocessor(images=dummy_input, return_tensors="pt"))
elif isinstance(preprocessor, FeatureExtractionMixin) and preprocessor.model_input_names[0] == "pixel_values":
# If dynamic axis (-1) we forward with a fixed dimension of 2 samples to avoid optimizations made by ONNX
batch_size = compute_effective_axis_dimension(batch_size, fixed_dimension=OnnxConfig.default_fixed_batch)
dummy_input = self._generate_dummy_images(batch_size, num_channels, image_height, image_width)
return dict(preprocessor(images=dummy_input, return_tensors="pt"))
elif (
isinstance(preprocessor, FeatureExtractionMixin) and preprocessor.model_input_names[0] == "input_features"
):
# If dynamic axis (-1) we forward with a fixed dimension of 2 samples to avoid optimizations made by ONNX
batch_size = compute_effective_axis_dimension(batch_size, fixed_dimension=OnnxConfig.default_fixed_batch)
dummy_input = self._generate_dummy_audio(batch_size, sampling_rate, time_duration, frequency)
return dict(preprocessor(dummy_input, return_tensors="pt"))
else:
raise ValueError(
"Unable to generate dummy inputs for the model. Please provide a tokenizer or a preprocessor."
)
def generate_dummy_inputs_onnxruntime(self, reference_model_inputs: Mapping[str, Any]) -> Mapping[str, Any]:
"""
Generate inputs for ONNX Runtime using the reference model inputs. Override this to run inference with seq2seq
models which have the encoder and decoder exported as separate ONNX files.
Args:
reference_model_inputs (`Mapping[str, Tensor]`):
Reference inputs for the model.
Returns:
`Mapping[str, Tensor]`: The mapping holding the kwargs to provide to the model's forward function
"""
return reference_model_inputs
def patch_ops(self):
for spec in self._patching_specs:
custom_op = spec.custom_op if spec.op_wrapper is None else spec.op_wrapper(spec.custom_op)
setattr(spec.o, spec.name, custom_op)
def restore_ops(self):
for spec in self._patching_specs:
orig_op = spec.orig_op if spec.op_wrapper is None else spec.op_wrapper(spec.orig_op)
setattr(spec.o, spec.name, orig_op)
@classmethod
def flatten_output_collection_property(cls, name: str, field: Iterable[Any]) -> dict[str, Any]:
"""
Flatten any potentially nested structure, expanding the name of the field with the index of the element within the
structure.
Args:
name: The name of the nested structure
field: The structure to, potentially, be flattened
Returns:
(dict[str, Any]): Outputs with flattened structure and keys mapping to this new structure.
"""
from itertools import chain
return {f"{name}.{idx}": item for idx, item in enumerate(chain.from_iterable(field))}
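# Illustrative example of the flattening contract: nested elements are expanded
# into indexed keys, e.g.
#   flatten_output_collection_property("hidden", [("a", "b"), ("c",)])
#   -> {"hidden.0": "a", "hidden.1": "b", "hidden.2": "c"}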
class OnnxConfigWithPast(OnnxConfig, ABC):
def __init__(
self,
config: "PreTrainedConfig",
task: str = "default",
patching_specs: Optional[list[PatchingSpec]] = None,
use_past: bool = False,
):
super().__init__(config, task=task, patching_specs=patching_specs)
self.use_past = use_past
@classmethod
def with_past(cls, config: "PreTrainedConfig", task: str = "default") -> "OnnxConfigWithPast":
"""
Instantiate an OnnxConfig with the `use_past` attribute set to `True`
Args:
config: The underlying model's config to use when exporting to ONNX
Returns:
OnnxConfig with `.use_past = True`
"""
return cls(config, task=task, use_past=True)
@property
def outputs(self) -> Mapping[str, Mapping[int, str]]:
common_outputs = super().outputs
if self.use_past:
self.fill_with_past_key_values_(common_outputs, direction="outputs")
return common_outputs
@property
def values_override(self) -> Optional[Mapping[str, Any]]:
if hasattr(self._config, "use_cache"):
return {"use_cache": self.use_past}
return None
@property
def num_layers(self) -> int:
"""
The number of layers attribute retrieved from the model config. Override this for model configs where the
number of layers attribute is not called `num_layers`.
"""
if not hasattr(self._config, "num_layers"):
raise AttributeError(
"could not find the number of layers attribute in the model configuration, override the num_layers"
" property of the model OnnxConfig to solve this"
)
return self._config.num_layers
@property
def num_attention_heads(self) -> int:
"""
The number of attention heads attribute retrieved from the model config. Override this for model configs where
the number of attention heads attribute is not called `num_attention_heads`.
"""
if not hasattr(self._config, "num_attention_heads"):
raise AttributeError(
"could not find the number of attention heads attribute in the model configuration, override the"
" num_attention_heads property of the model OnnxConfig to solve this"
)
return self._config.num_attention_heads
def generate_dummy_inputs(
self,
tokenizer: "PreTrainedTokenizerBase",
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
# TODO: should we set seq_length = 1 when self.use_past = True?
common_inputs = super().generate_dummy_inputs(
tokenizer,
batch_size=batch_size,
seq_length=seq_length,
is_pair=is_pair,
)
if self.use_past:
if not is_torch_available():
raise ValueError("Cannot generate dummy past_keys inputs without PyTorch installed.")
else:
import torch
batch, seqlen = common_inputs["input_ids"].shape
# Not using the same length for past_key_values
past_key_values_length = seqlen + 2
shape = (
batch,
self.num_attention_heads,
past_key_values_length,
self._config.hidden_size // self.num_attention_heads,
)
if "attention_mask" in common_inputs:
mask_dtype = common_inputs["attention_mask"].dtype
common_inputs["attention_mask"] = torch.cat(
[common_inputs["attention_mask"], torch.ones(batch, past_key_values_length, dtype=mask_dtype)],
dim=1,
)
common_inputs["past_key_values"] = []
for _ in range(self.num_layers):
common_inputs["past_key_values"].append((torch.zeros(shape), torch.zeros(shape)))
return common_inputs
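# Shape sketch (illustrative): for a hypothetical decoder with 12 layers, 12 heads
# and hidden_size=768, the cache built above holds 12 (key, value) pairs of zeros
# shaped (2, 12, seq_len + 2, 64), and the attention mask is widened by the same
# past length so it covers past and current positions alike.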
def fill_with_past_key_values_(
self, inputs_or_outputs: Mapping[str, Mapping[int, str]], direction: str, inverted_values_shape: bool = False
):
"""
Fill the inputs_or_outputs mapping with past_key_values dynamic axes, taking the direction into account.
Args:
inputs_or_outputs: The mapping to fill.
direction: either "inputs" or "outputs"; it specifies whether inputs_or_outputs is the input mapping or the
output mapping. This is important for axes naming.
inverted_values_shape:
If `True`, store values on dynamic axis 1, else on axis 2.
"""
if direction not in ["inputs", "outputs"]:
raise ValueError(f'direction must either be "inputs" or "outputs", but {direction} was given')
name = "past_key_values" if direction == "inputs" else "present"
for i in range(self.num_layers):
inputs_or_outputs[f"{name}.{i}.key"] = {0: "batch", 2: "past_sequence + sequence"}
if inverted_values_shape:
inputs_or_outputs[f"{name}.{i}.value"] = {0: "batch", 1: "past_sequence + sequence"}
else:
inputs_or_outputs[f"{name}.{i}.value"] = {0: "batch", 2: "past_sequence + sequence"}
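# For example (illustrative), with num_layers=2 and direction="inputs" this registers:
#   past_key_values.0.key   -> {0: "batch", 2: "past_sequence + sequence"}
#   past_key_values.0.value -> {0: "batch", 2: "past_sequence + sequence"}
#   past_key_values.1.key / past_key_values.1.value -> likewise
# With direction="outputs" the same axes appear under the "present.*" names.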
def _flatten_past_key_values_(self, flattened_output, name, idx, t):
flattened_output[f"{name}.{idx}.key"] = t[0]
flattened_output[f"{name}.{idx}.value"] = t[1]
def flatten_output_collection_property(self, name: str, field: Iterable[Any]) -> dict[str, Any]:
flattened_output = {}
if name in ["present", "past_key_values"]:
for idx, t in enumerate(field):
self._flatten_past_key_values_(flattened_output, name, idx, t)
else:
flattened_output = super().flatten_output_collection_property(name, field)
return flattened_output
class OnnxSeq2SeqConfigWithPast(OnnxConfigWithPast):
@property
def outputs(self) -> Mapping[str, Mapping[int, str]]:
common_outputs = super(OnnxConfigWithPast, self).outputs
# Renaming the outputs axes properly.
for name, axes_names in common_outputs.items():
sequence_name = "encoder_sequence" if "encoder" in name else "decoder_sequence"
for axis_idx, name in axes_names.items():
if "sequence" in name:
axes_names[axis_idx] = sequence_name
# We reset the value as the order in common_outputs (OrderedDict) is lost otherwise
else:
axes_names[axis_idx] = name
if self.use_past:
self.fill_with_past_key_values_(common_outputs, direction="outputs")
return common_outputs
@property
def num_layers(self) -> tuple[int, ...]:
try:
num_layers = super().num_layers
num_layers = (num_layers, num_layers)
except AttributeError:
if hasattr(self._config, "encoder_layers") and hasattr(self._config, "decoder_layers"):
num_layers = (self._config.encoder_layers, self._config.decoder_layers)
else:
raise AttributeError(
"could not find the number of encoder and decoder layers attributes in the model configuration,"
" override the num_layers property of the model OnnxConfig to solve this"
)
return num_layers
@property
def num_attention_heads(self) -> tuple[int, ...]:
try:
num_attention_heads = super().num_attention_heads
num_attention_heads = (num_attention_heads, num_attention_heads)
except AttributeError:
if hasattr(self._config, "encoder_attention_heads") and hasattr(self._config, "decoder_attention_heads"):
num_attention_heads = (self._config.encoder_attention_heads, self._config.decoder_attention_heads)
else:
raise AttributeError(
"could not find the number of attention heads for the encoder and the decoder attributes in the"
" model configuration, override the num_attention_heads property of the model OnnxConfig to solve"
" this"
)
return num_attention_heads
def generate_dummy_inputs(
self,
tokenizer: Optional["PreTrainedTokenizerBase"],
batch_size: int = -1,
seq_length: int = -1,
is_pair: bool = False,
) -> Mapping[str, Any]:
encoder_inputs = super(OnnxConfigWithPast, self).generate_dummy_inputs(
tokenizer,
batch_size=batch_size,
seq_length=seq_length,
is_pair=is_pair,
)
# Generate decoder inputs
decoder_seq_length = seq_length if not self.use_past else 1
decoder_inputs = super(OnnxConfigWithPast, self).generate_dummy_inputs(
tokenizer,
batch_size=batch_size,
seq_length=decoder_seq_length,
is_pair=is_pair,
)
decoder_inputs = {f"decoder_{name}": tensor for name, tensor in decoder_inputs.items()}
common_inputs = dict(**encoder_inputs, **decoder_inputs)
if self.use_past:
if not is_torch_available():
raise ValueError("Cannot generate dummy past_keys inputs without PyTorch installed.")
else:
import torch
batch = common_inputs["input_ids"].shape[0]
encoder_seq_length = common_inputs["input_ids"].shape[1]
decoder_seq_length = common_inputs["decoder_input_ids"].shape[1]
num_encoder_attention_heads, num_decoder_attention_heads = self.num_attention_heads
encoder_shape = (
batch,
num_encoder_attention_heads,
encoder_seq_length,
self._config.hidden_size // num_encoder_attention_heads,
)
decoder_shape = (
batch,
num_decoder_attention_heads,
# Not using the same length for past_key_values
decoder_seq_length + 3,
self._config.hidden_size // num_decoder_attention_heads,
)
common_inputs["past_key_values"] = []
# If the numbers of encoder and decoder layers are present in the model configuration, both are taken into account
num_encoder_layers, num_decoder_layers = self.num_layers
min_num_layers = min(num_encoder_layers, num_decoder_layers)
max_num_layers = max(num_encoder_layers, num_decoder_layers) - min_num_layers
remaining_side_name = "encoder" if num_encoder_layers > num_decoder_layers else "decoder"
for _ in range(min_num_layers):
# For encoder-decoder models, past_key_values contains pre-computed values for both the encoder and the
# decoder layers, hence a tuple of 4 tensors instead of 2
common_inputs["past_key_values"].append(
(
torch.zeros(decoder_shape),
torch.zeros(decoder_shape),
torch.zeros(encoder_shape),
torch.zeros(encoder_shape),
)
)
# TODO: test this.
shape = encoder_shape if remaining_side_name == "encoder" else decoder_shape
for _ in range(min_num_layers, max_num_layers):
common_inputs["past_key_values"].append((torch.zeros(shape), torch.zeros(shape)))
return common_inputs
def fill_with_past_key_values_(self, inputs_or_outputs: Mapping[str, Mapping[int, str]], direction: str):
if direction not in ["inputs", "outputs"]:
raise ValueError(f'direction must either be "inputs" or "outputs", but {direction} was given')
name = "past_key_values" if direction == "inputs" else "present"
# If the numbers of encoder and decoder layers are present in the model configuration, both are taken into account
num_encoder_layers, num_decoder_layers = self.num_layers
min_num_layers = min(num_encoder_layers, num_decoder_layers)
max_num_layers = max(num_encoder_layers, num_decoder_layers) - min_num_layers
remaining_side_name = "encoder" if num_encoder_layers > num_decoder_layers else "decoder"
encoder_sequence = "past_encoder_sequence"
decoder_sequence = "past_decoder_sequence" if direction == "inputs" else "past_decoder_sequence + sequence"
for i in range(min_num_layers):
inputs_or_outputs[f"{name}.{i}.decoder.key"] = {0: "batch", 2: decoder_sequence}
inputs_or_outputs[f"{name}.{i}.decoder.value"] = {0: "batch", 2: decoder_sequence}
inputs_or_outputs[f"{name}.{i}.encoder.key"] = {0: "batch", 2: encoder_sequence}
inputs_or_outputs[f"{name}.{i}.encoder.value"] = {0: "batch", 2: encoder_sequence}
for i in range(min_num_layers, max_num_layers):
if remaining_side_name == "encoder":
axes_info = {0: "batch", 2: encoder_sequence}
else:
axes_info = {0: "batch", 2: decoder_sequence}
inputs_or_outputs[f"{name}.{i}.{remaining_side_name}.key"] = axes_info
inputs_or_outputs[f"{name}.{i}.{remaining_side_name}.value"] = axes_info
def _flatten_past_key_values_(self, flattened_output, name, idx, t):
flattened_output[f"{name}.{idx}.decoder.key"] = t[0]
flattened_output[f"{name}.{idx}.decoder.value"] = t[1]
flattened_output[f"{name}.{idx}.encoder.key"] = t[2]
flattened_output[f"{name}.{idx}.encoder.value"] = t[3]
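# Minimal sketch (assumption, not part of the original file) of how a model-specific
# config plugs into the base class, mirroring the pattern used by encoder-only models:
#
#   class MyModelOnnxConfig(OnnxConfig):
#       @property
#       def inputs(self) -> Mapping[str, Mapping[int, str]]:
#           return OrderedDict(
#               [
#                   ("input_ids", {0: "batch", 1: "sequence"}),
#                   ("attention_mask", {0: "batch", 1: "sequence"}),
#               ]
#           )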

View File

@ -0,0 +1,109 @@
# Copyright 2021 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from ctypes import c_float, sizeof
from enum import Enum
from typing import TYPE_CHECKING, Optional, Union
if TYPE_CHECKING:
from .. import AutoFeatureExtractor, AutoProcessor, AutoTokenizer # tests_ignore
class ParameterFormat(Enum):
Float = c_float
@property
def size(self) -> int:
"""
Number of bytes required for this data type
Returns:
Integer > 0
"""
return sizeof(self.value)
def compute_effective_axis_dimension(dimension: int, fixed_dimension: int, num_token_to_add: int = 0) -> int:
"""
Args:
dimension:
fixed_dimension:
num_token_to_add:
Returns:
"""
# < 0 is possible if using a dynamic axis
if dimension <= 0:
dimension = fixed_dimension
dimension -= num_token_to_add
return dimension
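# Worked example (illustrative): exporting with a dynamic sequence axis (-1) while the
# tokenizer adds 3 special tokens falls back to the fixed dimension and subtracts them:
#   compute_effective_axis_dimension(-1, fixed_dimension=8, num_token_to_add=3)  # -> 5
# so 5 dummy tokens plus 3 special tokens fill exactly 8 positions.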
def compute_serialized_parameters_size(num_parameters: int, dtype: ParameterFormat) -> int:
"""
Compute the size taken by all the parameters for the given storage format when serializing the model
Args:
num_parameters: Number of parameters to be saved
dtype: The data format in which each parameter will be saved
Returns:
Size (in bytes) taken to save all the parameters
"""
return num_parameters * dtype.size
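# For example (illustrative): ParameterFormat.Float.size is sizeof(c_float) == 4, so a
# hypothetical 110M-parameter model serializes to roughly
#   compute_serialized_parameters_size(110_000_000, ParameterFormat.Float)  # -> 440_000_000 bytes (~440 MB)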
def get_preprocessor(model_name: str) -> Optional[Union["AutoTokenizer", "AutoFeatureExtractor", "AutoProcessor"]]:
"""
Gets a preprocessor (tokenizer, feature extractor or processor) that is available for `model_name`.
Args:
model_name (`str`): Name of the model for which a preprocessor is loaded.
Returns:
`Optional[Union[AutoTokenizer, AutoFeatureExtractor, AutoProcessor]]`:
If a processor is found, it is returned. Otherwise, if a tokenizer or a feature extractor exists, it is
returned. If both a tokenizer and a feature extractor exist, an error is raised. The function returns
`None` if no preprocessor is found.
"""
# Avoid circular imports by only importing this here.
from .. import AutoFeatureExtractor, AutoProcessor, AutoTokenizer # tests_ignore
try:
return AutoProcessor.from_pretrained(model_name)
except (ValueError, OSError, KeyError):
tokenizer, feature_extractor = None, None
try:
tokenizer = AutoTokenizer.from_pretrained(model_name)
except (OSError, KeyError):
pass
try:
feature_extractor = AutoFeatureExtractor.from_pretrained(model_name)
except (OSError, KeyError):
pass
if tokenizer is not None and feature_extractor is not None:
raise ValueError(
f"Couldn't auto-detect preprocessor for {model_name}. Found both a tokenizer and a feature extractor."
)
elif tokenizer is None and feature_extractor is None:
return None
elif tokenizer is not None:
return tokenizer
else:
return feature_extractor
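# Usage sketch (illustrative; the checkpoint name is only an example):
#   preprocessor = get_preprocessor("distilbert-base-uncased")
#   # typically a tokenizer for a text-only checkpoint; None if the checkpoint
#   # ships neither a tokenizer nor a feature extractor.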

View File

@ -209,6 +209,7 @@ if is_peft_available():
if is_accelerate_available():
from accelerate import Accelerator, skip_first_batches
from accelerate import __version__ as accelerate_version
from accelerate.state import AcceleratorState
from accelerate.utils import (
DataLoaderConfiguration,
@ -4966,12 +4967,7 @@ class Trainer:
# this would have been updated above, no need for it anymore
accelerator_config.pop("gradient_accumulation_kwargs")
args = {
"mixed_precision": self.args.mixed_precision,
"dataloader_config": dataloader_config,
"fsdp_plugin": self.args.fsdp_plugin,
"deepspeed_plugin": self.args.deepspeed_plugin,
}
args = {"deepspeed_plugin": self.args.deepspeed_plugin, "dataloader_config": dataloader_config}
# We defer compatibility checks to accelerator
if self.args.parallelism_config is not None:
@ -4985,7 +4981,7 @@ class Trainer:
if getattr(self.model, "tp_size", None) is not None and self.model.tp_size > 1:
self.is_tp_enabled = True
if self.args.parallelism_config is not None:
if is_accelerate_available("1.10.1"):
if version.parse(accelerate_version) > version.parse("1.10.1"):
if self.args.parallelism_config is not None:
from accelerate import ParallelismConfig
@ -4993,15 +4989,6 @@ class Trainer:
else:
raise ValueError("Requires accelerate>1.10.1 to use Tensor Parallelism.")
if is_accelerate_available("1.2.0"):
# if we don't have the correct version, we will instead rely on the env vars that were set in TrainingArguments
from accelerate.utils import TorchDynamoPlugin
dynamo_plugin = TorchDynamoPlugin(
backend=self.args.torch_compile_backend, mode=self.args.torch_compile_mode
)
args["dynamo_plugin"] = dynamo_plugin
# create accelerator object
self.accelerator = Accelerator(**args)
# some Trainer classes need to use `gather` instead of `gather_for_metrics`, thus we store a flag

View File

@ -462,8 +462,6 @@ class TrainingArguments:
fsdp json config file (e.g., `fsdp_config.json`) or an already loaded json file as `dict`.
A List of config and its options:
- fsdp_version (`int`, *optional*, defaults to `1`):
The version of FSDP to use. Defaults to 1.
- min_num_params (`int`, *optional*, defaults to `0`):
FSDP's minimum number of parameters for Default Auto Wrapping. (useful only when `fsdp` field is
passed).
@ -1536,28 +1534,6 @@ class TrainingArguments:
self.optim = OptimizerNames(self.optim)
# We need to get from the env as we are setting deepspeed mixed precison afterwards
self.mixed_precision = os.environ.get("ACCELERATE_MIXED_PRECISION", "no")
if self.fp16:
self.mixed_precision = "fp16"
elif self.bf16:
self.mixed_precision = "bf16"
if (self.torch_compile_mode is not None or self.torch_compile_backend is not None) and not self.torch_compile:
self.torch_compile = True
if self.torch_compile and self.torch_compile_backend is None:
if not self.use_cpu and is_torch_hpu_available():
self.torch_compile_backend = "hpu_backend"
else:
self.torch_compile_backend = "inductor"
if self.torch_compile:
# TODO: remove this once we've bumped the minimum accelerate version
if not is_accelerate_available("1.2.0"):
os.environ["ACCELERATE_DYNAMO_BACKEND"] = self.torch_compile_backend
if self.torch_compile_mode is not None:
os.environ["ACCELERATE_DYNAMO_MODE"] = self.torch_compile_mode
# We need to setup the accelerator config here *before* the first call to `self.device`
if is_accelerate_available():
if not isinstance(self.accelerator_config, AcceleratorConfig):
@ -1584,6 +1560,22 @@ class TrainingArguments:
if is_torch_available():
self.device
if (self.torch_compile_mode is not None or self.torch_compile_backend is not None) and not self.torch_compile:
self.torch_compile = True
if self.torch_compile and self.torch_compile_backend is None:
if not self.use_cpu and is_torch_hpu_available():
self.torch_compile_backend = "hpu_backend"
else:
self.torch_compile_backend = "inductor"
# accelerate integration for torch compile
if self.torch_compile:
# set env vars for accelerate
prefix = "ACCELERATE_DYNAMO_"
os.environ[prefix + "BACKEND"] = self.torch_compile_backend
if self.torch_compile_mode is not None:
os.environ[prefix + "MODE"] = self.torch_compile_mode
if is_torch_available() and self.torch_compile:
if is_torch_tf32_available():
if self.tf32 is None and not self.fp16 or self.bf16:
@ -1620,6 +1612,14 @@ class TrainingArguments:
torch.backends.cudnn.allow_tf32 = False
# no need to assert on else
# if training args is specified, it will override the one specified in the accelerate config
mixed_precision_dtype = os.environ.get("ACCELERATE_MIXED_PRECISION", "no")
if self.fp16:
mixed_precision_dtype = "fp16"
elif self.bf16:
mixed_precision_dtype = "bf16"
os.environ["ACCELERATE_MIXED_PRECISION"] = mixed_precision_dtype
if self.report_to == "all" or self.report_to == ["all"]:
# Import at runtime to avoid a circular import.
from .integrations import get_available_reporting_integrations
@ -1648,19 +1648,130 @@ class TrainingArguments:
if not isinstance(self.warmup_steps, int) or self.warmup_steps < 0:
raise ValueError("warmup_steps must be of type int and must be 0 or a positive integer.")
if self.fsdp is None:
self.fsdp = []
elif self.fsdp is True:
self.fsdp = [FSDPOption.FULL_SHARD]
elif isinstance(self.fsdp, str):
self.fsdp = [FSDPOption(s) for s in self.fsdp.split()]
if self.fsdp == [FSDPOption.OFFLOAD]:
raise ValueError(
"`--fsdp offload` can't work on its own. It needs to be added to `--fsdp full_shard` or "
'`--fsdp shard_grad_op`. For example, `--fsdp "full_shard offload"`.'
)
elif FSDPOption.FULL_SHARD in self.fsdp and FSDPOption.SHARD_GRAD_OP in self.fsdp:
raise ValueError("`--fsdp full_shard` is not compatible with `--fsdp shard_grad_op`.")
if self.gradient_checkpointing and (
FSDPOption.FULL_SHARD in self.fsdp or FSDPOption.HYBRID_SHARD in self.fsdp
):
logger.warning(
"When using FSDP full shard, instead of using `gradient_checkpointing` in TrainingArguments, please"
" use `activation_checkpointing` in `fsdp_config`. The former introduces a redundant AllGather"
" operation in backward pass. Reference: https://github.com/huggingface/transformers/issues/30404"
)
if self.fsdp_config is None:
self.fsdp_config = {}
if isinstance(self.fsdp_config, str):
if len(self.fsdp) == 0:
warnings.warn("`--fsdp_config` is useful only when `--fsdp` is specified.")
with open(self.fsdp_config, encoding="utf-8") as f:
self.fsdp_config = json.load(f)
if self.fsdp_config is not None and isinstance(self.fsdp_config, dict):
for k in list(self.fsdp_config.keys()):
if k.startswith("fsdp_"):
v = self.fsdp_config.pop(k)
self.fsdp_config[k[5:]] = v
self.fsdp_config["min_num_params"] = self.fsdp_config.get("min_num_params", 0)
# if fsdp_config["transformer_layer_cls_to_wrap"] is specified as a string, convert it to a list with a single object
if isinstance(self.fsdp_config.get("transformer_layer_cls_to_wrap", None), str):
self.fsdp_config["transformer_layer_cls_to_wrap"] = [self.fsdp_config["transformer_layer_cls_to_wrap"]]
if len(self.fsdp) == 0 and self.fsdp_config["min_num_params"] > 0:
warnings.warn("`min_num_params` is useful only when `--fsdp` is specified.")
if len(self.fsdp) == 0 and self.fsdp_config.get("transformer_layer_cls_to_wrap", None) is not None:
warnings.warn("`transformer_layer_cls_to_wrap` is useful only when `--fsdp` is specified.")
if (
len(self.fsdp) > 0
and self.fsdp_config["min_num_params"] > 0
and self.fsdp_config.get("transformer_layer_cls_to_wrap", None) is not None
):
raise ValueError("`min_num_params` and `transformer_layer_cls_to_wrap` are mutually exclusive.")
self.fsdp_config["xla"] = self.fsdp_config.get("xla", False)
self.fsdp_config["xla_fsdp_v2"] = self.fsdp_config.get("xla_fsdp_v2", False)
self.fsdp_config["xla_fsdp_grad_ckpt"] = self.fsdp_config.get("xla_fsdp_grad_ckpt", False)
if self.fsdp_config["xla"]:
if len(self.fsdp) > 0:
# store XLA fsdp configuration parameters into a dictionary
# Copy the config to avoid modifying the original config (which may be used for JSON serialization)
self.xla_fsdp_config = self.fsdp_config.get("xla_fsdp_settings", {}).copy()
# apply appropriate string to torch.dtype conversions for parameters
if "compute_dtype" in self.xla_fsdp_config:
self.xla_fsdp_config["compute_dtype"] = getattr(torch, self.xla_fsdp_config["compute_dtype"])
if "buffer_dtype" in self.xla_fsdp_config:
self.xla_fsdp_config["buffer_dtype"] = getattr(torch, self.xla_fsdp_config["buffer_dtype"])
else:
warnings.warn("XLA FSDP can be used only when `--fsdp` is specified.")
else:
if self.fsdp_config["xla_fsdp_grad_ckpt"]:
warnings.warn("`--xla_fsdp_grad_ckpt` is useful only when `--xla` is set to true.")
# accelerate integration for FSDP
if len(self.fsdp) > 0 and not self.fsdp_config["xla"]:
os.environ["ACCELERATE_USE_FSDP"] = "true"
from accelerate.utils.constants import (
FSDP_AUTO_WRAP_POLICY,
FSDP_SHARDING_STRATEGY,
)
prefix = "FSDP_"
for fsdp_option in self.fsdp:
if fsdp_option.upper() in FSDP_SHARDING_STRATEGY:
# set environment variable for FSDP sharding strategy
os.environ[f"{prefix}SHARDING_STRATEGY"] = str(
FSDP_SHARDING_STRATEGY.index(fsdp_option.upper()) + 1
)
elif fsdp_option == FSDPOption.OFFLOAD:
os.environ[f"{prefix}OFFLOAD_PARAMS"] = "true"
elif fsdp_option == FSDPOption.AUTO_WRAP:
os.environ[f"{prefix}AUTO_WRAP_POLICY"] = FSDP_AUTO_WRAP_POLICY[0]
if self.fsdp_config["min_num_params"] > 0:
os.environ[f"{prefix}MIN_NUM_PARAMS"] = str(self.fsdp_config["min_num_params"])
os.environ[f"{prefix}AUTO_WRAP_POLICY"] = FSDP_AUTO_WRAP_POLICY[1]
elif self.fsdp_config.get("transformer_layer_cls_to_wrap", None) is not None:
os.environ[f"{prefix}TRANSFORMER_CLS_TO_WRAP"] = ",".join(
self.fsdp_config["transformer_layer_cls_to_wrap"]
)
prefetch_policy = self.fsdp_config.get("backward_prefetch", "NO_PREFETCH")
os.environ[f"{prefix}BACKWARD_PREFETCH"] = prefetch_policy.upper()
os.environ[f"{prefix}FORWARD_PREFETCH"] = str(self.fsdp_config.get("forward_prefetch", "false")).lower()
sync_module_states = str(self.fsdp_config.get("sync_module_states", "true")).lower()
cpu_ram_efficient_loading = str(self.fsdp_config.get("cpu_ram_efficient_loading", "false")).lower()
if sync_module_states == "false" and cpu_ram_efficient_loading == "true":
# In this case, all the processes except the main process would have random weights leading
# to unexpected behaviour during training, thus throwing error here to prevent it.
raise ValueError('`sync_module_states` must be `"True"` if `cpu_ram_efficient_loading` is `"True"`')
os.environ[f"{prefix}SYNC_MODULE_STATES"] = sync_module_states
os.environ[f"{prefix}CPU_RAM_EFFICIENT_LOADING"] = cpu_ram_efficient_loading
os.environ[f"{prefix}USE_ORIG_PARAMS"] = str(self.fsdp_config.get("use_orig_params", "true")).lower()
if isinstance(self.debug, str):
self.debug = [DebugOption(s) for s in self.debug.split()]
elif self.debug is None:
self.debug = []
self.fsdp_plugin = None
fsdp_plugin_args = self._process_fsdp_args()
if fsdp_plugin_args is not None:
# Accelerate FSDP Plugin
from accelerate.utils import FullyShardedDataParallelPlugin
self.fsdp_plugin = FullyShardedDataParallelPlugin(**fsdp_plugin_args)
self.deepspeed_plugin = None
if self.deepspeed:
# - must be run very last in arg parsing, since it will use a lot of these settings.
@ -1679,13 +1790,15 @@ class TrainingArguments:
# Accelerate DeepSpeed Plugin
from accelerate.utils import DeepSpeedPlugin
os.environ["ACCELERATE_USE_DEEPSPEED"] = "true"
self.deepspeed_plugin = DeepSpeedPlugin(hf_ds_config=self.hf_deepspeed_config)
elif strtobool(os.environ.get("ACCELERATE_USE_DEEPSPEED", "false")):
# Accelerate DeepSpeed Plugin
from accelerate.utils import DeepSpeedPlugin
self.deepspeed_plugin = DeepSpeedPlugin()
self.deepspeed_plugin.set_mixed_precision(self.mixed_precision)
mixed_precision = os.environ.get("ACCELERATE_MIXED_PRECISION", "no")
self.deepspeed_plugin.set_mixed_precision(mixed_precision)
self.deepspeed_plugin.set_deepspeed_weakref()
if self.use_cpu:
@ -1766,6 +1879,8 @@ class TrainingArguments:
else:
AcceleratorState._reset_state(reset_partial_state=True)
self.distributed_state = None
if "ACCELERATE_USE_IPEX" not in os.environ:
os.environ["ACCELERATE_USE_IPEX"] = "false"
self._n_gpu = 1
if self.use_cpu or strtobool(os.environ.get("ACCELERATE_USE_CPU", "False")):
@ -2647,127 +2762,6 @@ class TrainingArguments:
self.data_seed = sampler_seed
return self
def _process_fsdp_args(self):
if self.fsdp is None:
self.fsdp = []
elif self.fsdp is True:
self.fsdp = [FSDPOption.FULL_SHARD]
elif isinstance(self.fsdp, str):
self.fsdp = [FSDPOption(s) for s in self.fsdp.split()]
if self.fsdp == [FSDPOption.OFFLOAD]:
raise ValueError(
"`--fsdp offload` can't work on its own. It needs to be added to `--fsdp full_shard` or "
'`--fsdp shard_grad_op`. For example, `--fsdp "full_shard offload"`.'
)
elif FSDPOption.FULL_SHARD in self.fsdp and FSDPOption.SHARD_GRAD_OP in self.fsdp:
raise ValueError("`--fsdp full_shard` is not compatible with `--fsdp shard_grad_op`.")
if self.gradient_checkpointing and (
FSDPOption.FULL_SHARD in self.fsdp or FSDPOption.HYBRID_SHARD in self.fsdp
):
logger.warning(
"When using FSDP full shard, instead of using `gradient_checkpointing` in TrainingArguments, please"
" use `activation_checkpointing` in `fsdp_config`. The former introduces a redundant AllGather"
" operation in backward pass. Reference: https://github.com/huggingface/transformers/issues/30404"
)
if self.fsdp_config is None:
self.fsdp_config = {}
if isinstance(self.fsdp_config, str):
if len(self.fsdp) == 0:
warnings.warn("`--fsdp_config` is useful only when `--fsdp` is specified.")
with open(self.fsdp_config, encoding="utf-8") as f:
self.fsdp_config = json.load(f)
if self.fsdp_config is not None and isinstance(self.fsdp_config, dict):
for k in list(self.fsdp_config.keys()):
if k.startswith("fsdp_"):
v = self.fsdp_config.pop(k)
self.fsdp_config[k[5:]] = v
self.fsdp_config["min_num_params"] = self.fsdp_config.get("min_num_params", 0)
# if fsdp_config["transformer_layer_cls_to_wrap"] is specified as a string, convert it to a list with a single object
if isinstance(self.fsdp_config.get("transformer_layer_cls_to_wrap", None), str):
self.fsdp_config["transformer_layer_cls_to_wrap"] = [self.fsdp_config["transformer_layer_cls_to_wrap"]]
if len(self.fsdp) == 0 and self.fsdp_config["min_num_params"] > 0:
warnings.warn("`min_num_params` is useful only when `--fsdp` is specified.")
if len(self.fsdp) == 0 and self.fsdp_config.get("transformer_layer_cls_to_wrap", None) is not None:
warnings.warn("`transformer_layer_cls_to_wrap` is useful only when `--fsdp` is specified.")
if (
len(self.fsdp) > 0
and self.fsdp_config["min_num_params"] > 0
and self.fsdp_config.get("transformer_layer_cls_to_wrap", None) is not None
):
raise ValueError("`min_num_params` and `transformer_layer_cls_to_wrap` are mutually exclusive.")
self.fsdp_config["xla"] = self.fsdp_config.get("xla", False)
self.fsdp_config["xla_fsdp_v2"] = self.fsdp_config.get("xla_fsdp_v2", False)
self.fsdp_config["xla_fsdp_grad_ckpt"] = self.fsdp_config.get("xla_fsdp_grad_ckpt", False)
if self.fsdp_config["xla"]:
if len(self.fsdp) > 0:
# store XLA fsdp configuration parameters into a dictionary
# Copy the config to avoid modifying the original config (which may be used for JSON serialization)
self.xla_fsdp_config = self.fsdp_config.get("xla_fsdp_settings", {}).copy()
# apply appropriate string to torch.dtype conversions for parameters
if "compute_dtype" in self.xla_fsdp_config:
self.xla_fsdp_config["compute_dtype"] = getattr(torch, self.xla_fsdp_config["compute_dtype"])
if "buffer_dtype" in self.xla_fsdp_config:
self.xla_fsdp_config["buffer_dtype"] = getattr(torch, self.xla_fsdp_config["buffer_dtype"])
else:
warnings.warn("XLA FSDP can be used only when `--fsdp` is specified.")
else:
if self.fsdp_config["xla_fsdp_grad_ckpt"]:
warnings.warn("`--xla_fsdp_grad_ckpt` is useful only when `--xla` is set to true.")
# accelerate integration for FSDP
fsdp_plugin_args = None
if len(self.fsdp) > 0 and not self.fsdp_config["xla"]:
from accelerate.utils.constants import (
FSDP_AUTO_WRAP_POLICY,
FSDP_SHARDING_STRATEGY,
)
fsdp_plugin_args = {}
for fsdp_option in self.fsdp:
if fsdp_option.upper() in FSDP_SHARDING_STRATEGY:
fsdp_plugin_args["sharding_strategy"] = fsdp_option
elif fsdp_option == FSDPOption.OFFLOAD:
fsdp_plugin_args["cpu_offload"] = True
elif fsdp_option == FSDPOption.AUTO_WRAP:
fsdp_plugin_args["auto_wrap_policy"] = FSDP_AUTO_WRAP_POLICY[0]
if self.fsdp_config["min_num_params"] > 0:
fsdp_plugin_args["min_num_params"] = self.fsdp_config["min_num_params"]
fsdp_plugin_args["auto_wrap_policy"] = FSDP_AUTO_WRAP_POLICY[1]
elif self.fsdp_config.get("transformer_layer_cls_to_wrap", None) is not None:
fsdp_plugin_args["transformer_cls_names_to_wrap"] = ",".join(
self.fsdp_config["transformer_layer_cls_to_wrap"]
)
fsdp_plugin_args["fsdp_version"] = self.fsdp_config.get("fsdp_version", 1)
prefetch_policy = self.fsdp_config.get("backward_prefetch", "NO_PREFETCH")
fsdp_plugin_args["backward_prefetch"] = prefetch_policy.upper()
fsdp_plugin_args["forward_prefetch"] = str(self.fsdp_config.get("forward_prefetch", "false")).lower()
sync_module_states = str(self.fsdp_config.get("sync_module_states", "true")).lower()
cpu_ram_efficient_loading = str(self.fsdp_config.get("cpu_ram_efficient_loading", "false")).lower()
if sync_module_states == "false" and cpu_ram_efficient_loading == "true":
# In this case, all the processes except the main process would have random weights leading
# to unexpected behaviour during training, thus throwing error here to prevent it.
raise ValueError('`sync_module_states` must be `"True"` if `cpu_ram_efficient_loading` is `"True"`')
# we need to set the env here as otherwise we get a warning in accelerate + we need to set it for transformers
fsdp_plugin_args["cpu_ram_efficient_loading"] = cpu_ram_efficient_loading
os.environ["FSDP_CPU_RAM_EFFICIENT_LOADING"] = cpu_ram_efficient_loading
fsdp_plugin_args["sync_module_states"] = sync_module_states
fsdp_plugin_args["use_orig_params"] = str(self.fsdp_config.get("use_orig_params", "true")).lower()
return fsdp_plugin_args
class ParallelMode(Enum):
NOT_PARALLEL = "not_parallel"

View File

@ -2161,7 +2161,7 @@ def create_import_structure_from_path(module_path):
{
'albert': {
frozenset(): {
'configuration_albert': {'AlbertConfig'}
'configuration_albert': {'AlbertConfig', 'AlbertOnnxConfig'}
},
frozenset({'tokenizers'}): {
'tokenization_albert_fast': {'AlbertTokenizerFast'}
@ -2337,7 +2337,7 @@ def spread_import_structure(nested_import_structure):
{
'albert': {
frozenset(): {
'configuration_albert': {'AlbertConfig'}
'configuration_albert': {'AlbertConfig', 'AlbertOnnxConfig'}
},
frozenset({'tokenizers'}): {
'tokenization_albert_fast': {'AlbertTokenizerFast'}
@ -2364,7 +2364,7 @@ def spread_import_structure(nested_import_structure):
'albert.tokenization_albert_fast': {'AlbertTokenizerFast'}
},
frozenset(): {
'albert.configuration_albert': {'AlbertConfig'},
'albert.configuration_albert': {'AlbertConfig', 'AlbertOnnxConfig'},
'align.processing_align': {'AlignProcessor'},
'align.configuration_align': {'AlignConfig', 'AlignTextConfig', 'AlignVisionConfig'},
'altclip.configuration_altclip': {'AltCLIPConfig', 'AltCLIPTextConfig', 'AltCLIPVisionConfig'},
@ -2465,7 +2465,7 @@ def define_import_structure(module_path: str, prefix: str | None = None) -> IMPO
'albert.tokenization_albert_fast': {'AlbertTokenizerFast'}
},
frozenset(): {
'albert.configuration_albert': {'AlbertConfig'},
'albert.configuration_albert': {'AlbertConfig', 'AlbertOnnxConfig'},
'align.processing_align': {'AlignProcessor'},
'align.configuration_align': {'AlignConfig', 'AlignTextConfig', 'AlignVisionConfig'},
'altclip.configuration_altclip': {'AltCLIPConfig', 'AltCLIPTextConfig', 'AltCLIPVisionConfig'},

View File

@ -0,0 +1,488 @@
import re
from typing import Optional, Union
import matplotlib.pyplot as plt
import numpy as np
import requests
from PIL import Image
from ..image_transforms import convert_to_rgb, to_pil_image, unnormalize
from ..image_utils import ChannelDimension, infer_channel_dimension_format
from ..models.auto import AutoConfig, AutoProcessor
# archs failing that should raise immediately for this util:
INCOMPATIBLE_MODELS = [
"bit",
"colpali",
"colqwen2",
"convnext",
"d_fine",
"data2vec",
"efficientloftr",
"efficientnet",
"fuyu",
"gemma3",
"glm4v",
"glpn",
"hgnet_v2",
"hiera",
"internvl",
"janus",
"layoutlmv3",
"levit",
"lightglue",
"llama4",
"mistral3",
"mllama",
"mobilevit",
"mobilevitv2",
"musicgen",
"musicgen_melody",
"oneformer",
"perceiver",
"perception_lm",
"phi4_multimodal",
"regnet",
"resnet",
"superglue",
"superpoint",
"swin2sr",
"timm_wrapper",
"tvp",
"udop",
"vitmatte",
"vitpose",
"vjepa2",
"whisper",
]
DEFAULT_IMAGE_URL = (
"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/hf-logo-224x224.png"
)
def _looks_like_global(tile: np.ndarray, base: Image.Image, *, mae_tol: float = 0.05) -> bool:
"""
Heuristic to check if a tile is a downscaled version of the original image.
Uses mean absolute error with a strict threshold.
"""
base_r = base.convert("RGB").resize(tile.shape[:2][::-1], Image.BILINEAR)
base_np = np.asarray(base_r).astype(np.float32) / 255.0
tile_f32 = tile.astype(np.float32)
if tile_f32.max() > 1.5:
tile_f32 /= 255.0
mae = np.abs(tile_f32 - base_np).mean()
return mae < mae_tol
def _find_global_tile_index(tiles: np.ndarray, base: Image.Image) -> Optional[int]:
"""
Find which tile (if any) is the global/downscaled image.
Checks first and last tiles only, as models place global images at these positions.
Returns:
Index of global tile (0 or len-1), or None if not found
"""
if tiles.shape[0] <= 1:
return None
if _looks_like_global(tiles[0], base):
return 0
if _looks_like_global(tiles[-1], base):
return tiles.shape[0] - 1
return None
class ImageVisualizer:
def __init__(self, repo_id: str):
self.processor = AutoProcessor.from_pretrained(repo_id, trust_remote_code=False)
self.config = AutoConfig.from_pretrained(repo_id, trust_remote_code=False)
if hasattr(self.processor, "image_processor"):
image_processor = self.processor.image_processor
elif hasattr(self.processor, "image_mean"):
image_processor = self.processor # weak test, but works most of the time
else:
raise ValueError(f"No image processor found for {repo_id}.")
self.channel_means = getattr(image_processor, "image_mean", [0.485, 0.456, 0.406])
self.channel_stds = getattr(image_processor, "image_std", [0.229, 0.224, 0.225])
if hasattr(self.processor, "image_token"):
self.image_token_marker = self.processor.image_token
elif hasattr(self.processor, "image_token_id"):
self.image_token_marker = self.processor.decode(self.processor.image_token_id)
else:
self.image_token_marker = "<image>"
self.default_prompt = f"{self.image_token_marker} How does it look?"
self.vision_config = getattr(self.config, "vision_config", None)
self.patch_size = getattr(self.vision_config, "patch_size", getattr(image_processor, "patch_size", 14))
self.merge_size = getattr(image_processor, "merge_size", 1)
def _prepare_images_for_display(self, image_array: np.ndarray) -> np.ndarray:
"""
Convert unnormalized images to NHWC format for display, flattening any extra batch dimensions.
Args:
image_array: Array of shape [..., C, H, W] or [..., H, W, C]
Returns:
Array of shape [N, H, W, C] suitable for plotting
"""
input_format = infer_channel_dimension_format(image_array)
if input_format == ChannelDimension.FIRST:
if image_array.ndim == 3:
image_array = image_array[np.newaxis, ...]
elif image_array.ndim > 4:
batch_size = int(np.prod(image_array.shape[: image_array.ndim - 3]))
num_channels, height, width = image_array.shape[-3:]
image_array = image_array.reshape(batch_size, num_channels, height, width)
if image_array.ndim == 4:
image_array = np.transpose(image_array, (0, 2, 3, 1))
else:
if image_array.ndim == 3:
image_array = image_array[np.newaxis, ...]
elif image_array.ndim > 4:
batch_size = int(np.prod(image_array.shape[: image_array.ndim - 3]))
height, width, num_channels = image_array.shape[-3:]
image_array = image_array.reshape(batch_size, height, width, num_channels)
return image_array
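# Shape walk-through (illustrative): channels-first tiles shaped
# (batch, tiles, 3, H, W) = (1, 4, 3, 336, 336) are flattened to (4, 3, 336, 336)
# and then transposed to NHWC (4, 336, 336, 3) for matplotlib.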
def _display_single_image(
self,
image_array: np.ndarray,
show_patch_grid: bool,
figsize=(7, 7),
patch_grid_rows=None,
patch_grid_cols=None,
):
plt.figure(figsize=figsize)
plt.imshow(image_array)
plt.xticks([])
plt.yticks([])
if show_patch_grid:
height, width = image_array.shape[:2]
if patch_grid_rows is not None and patch_grid_cols is not None:
step_h = height / patch_grid_rows
step_w = width / patch_grid_cols
for i in range(1, patch_grid_cols):
plt.axvline(i * step_w, color="red", linewidth=0.5)
for i in range(1, patch_grid_rows):
plt.axhline(i * step_h, color="red", linewidth=0.5)
else:
step = max(1, min(height, width) // self.patch_size)
for x_pos in range(0, width, step):
plt.axvline(x_pos, color="red", linewidth=0.5)
for y_pos in range(0, height, step):
plt.axhline(y_pos, color="red", linewidth=0.5)
caption = f"{width}×{height} | mean={', '.join(f'{m:.3f}' for m in self.channel_means)} | std={', '.join(f'{s:.3f}' for s in self.channel_stds)}"
plt.tight_layout()
plt.figtext(0.5, -0.02, caption, ha="center", va="top", fontsize=12)
plt.show()
def _display_tiled_images(
self,
tiles_array: np.ndarray,
source_image: Image.Image,
rows: Optional[int] = None,
cols: Optional[int] = None,
aspect_ratio: float = 1.0,
add_grid: bool = True,
figsize=(7, 7),
global_tile: Optional[np.ndarray] = None,
):
"""
Display a grid of image tiles with optional global image display.
Args:
tiles_array: Array of tiles to display in grid format
source_image: Original source image
rows: Number of rows in the grid
cols: Number of columns in the grid
aspect_ratio: Aspect ratio for grid layout calculation
add_grid: Whether to add patch grid overlay
figsize: Figure size for matplotlib
global_tile: Optional global/downscaled image to display separately
"""
num_tiles = tiles_array.shape[0]
# Infer grid if not specified
grid_rows, grid_cols = rows, cols
if grid_rows is None or grid_cols is None:
if aspect_ratio >= 1:
guessed_cols = int(np.ceil(np.sqrt(num_tiles * aspect_ratio)))
guessed_rows = int(np.ceil(num_tiles / max(guessed_cols, 1)))
else:
guessed_rows = int(np.ceil(np.sqrt(num_tiles / max(aspect_ratio, 1e-8))))
guessed_cols = int(np.ceil(num_tiles / max(guessed_rows, 1)))
grid_rows = grid_rows if grid_rows is not None else guessed_rows
grid_cols = grid_cols if grid_cols is not None else guessed_cols
fig, axes = plt.subplots(grid_rows, grid_cols, figsize=figsize, squeeze=False)
tile_index = 0
for row_idx in range(grid_rows):
for col_idx in range(grid_cols):
ax = axes[row_idx, col_idx]
if tile_index < num_tiles:
tile_image = tiles_array[tile_index]
ax.imshow(tile_image)
ax.set_xticks([])
ax.set_yticks([])
if add_grid:
height, width = tile_image.shape[:2]
step = max(1, min(height, width) // self.patch_size)
for x_pos in range(0, width, step):
ax.axvline(x_pos, color="red", linewidth=0.5)
for y_pos in range(0, height, step):
ax.axhline(y_pos, color="red", linewidth=0.5)
else:
ax.axis("off")
tile_index += 1
unique = sorted({f"{t.shape[1]}×{t.shape[0]}" for t in tiles_array})
sizes = ", ".join(unique)
caption = f"{tiles_array.shape[0]} patches | {sizes} | mean={', '.join(f'{m:.3f}' for m in self.channel_means)} | std={', '.join(f'{s:.3f}' for s in self.channel_stds)}"
plt.tight_layout()
fig.text(0.5, 0.02, caption, ha="center", va="bottom", fontsize=12)
plt.show()
if global_tile is not None:
fig2, ax2 = plt.subplots(figsize=figsize)
ax2.imshow(global_tile)
ax2.set_xticks([])
ax2.set_yticks([])
ax2.set_aspect("equal", adjustable="box")
fig2.subplots_adjust(left=0, right=1, top=1, bottom=0)
h0, w0 = global_tile.shape[:2]
caption = f"Global: {w0}×{h0} | mean={', '.join(f'{m:.3f}' for m in self.channel_means)} | std={', '.join(f'{s:.3f}' for s in self.channel_stds)}"
fig2.text(0.5, 0.02, caption, ha="center", va="bottom", fontsize=12)
plt.show()
def default_message(self, full_output: bool = False) -> str:
"""
Build a single formatted prompt string using the processor's chat template.
Contains one image (HF logo) and one user text message.
If available, adds the generation prompt as well.
Falls back to a minimal '<image>' string if no template is available.
"""
# ensure this is a multimodal processor with image + tokenizer
if not (
hasattr(self.processor, "attributes")
and "image_processor" in self.processor.attributes
and "tokenizer" in self.processor.attributes
):
raise RuntimeError(
"Processor does not expose both 'image_processor' and 'tokenizer'; cannot build multimodal example."
)
conversation = [
{
"role": "user",
"content": [
{
"type": "image",
"url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/hf-logo-224x224.png",
},
{"type": "text", "text": "Please describe this image."},
],
}
]
try:
print("For a 224x224 RGB png image: \n")
decoded_message = self.processor.batch_decode(
self.processor.apply_chat_template(
conversation,
add_generation_prompt=True,
tokenize=True,
return_dict=False,
truncation=False,
),
skip_special_tokens=False,
)[0]
image_token_string = getattr(self.processor, "image_token", "<image>")
token_escaped = re.escape(image_token_string)
image_token_run_pattern = re.compile(rf"(?:{token_escaped})(?:\s*{token_escaped}){{2,}}")
def compress_image_token_run(match: re.Match) -> str:
n_tokens = match.group(0).count(image_token_string)
return f"{image_token_string}[...{n_tokens} tokens...]{image_token_string}"
if full_output:
return decoded_message
else:
return image_token_run_pattern.sub(compress_image_token_run, decoded_message)
except ValueError:
image_token_string = getattr(
self.processor,
"image_token",
getattr(getattr(self.processor, "tokenizer", None), "image_token", "<image>"),
)
return f"{image_token_string} {'Please describe this image.'}"
def visualize(
self,
images: Optional[Union[Image.Image, np.ndarray, str, list[Union[Image.Image, np.ndarray, str]]]] = None,
rows: Optional[int] = None,
cols: Optional[int] = None,
add_grid: bool = True,
figsize=(12, 12),
):
"""
Visualize the model-processed image(s). Only single images are supported.
If the processor returns multiple tiles, display them in a grid with optional patch grid overlay.
"""
if images is None:
images = Image.open(requests.get(DEFAULT_IMAGE_URL, stream=True).raw)
if not isinstance(images, list):
images = [images]
else:
if len(images) > 1:
raise ValueError(
"You passed a list of several images. Only single images are accepted by the visualizer."
)
pil_images = [convert_to_rgb(to_pil_image(x)) for x in images]
img_width, img_height = pil_images[0].size
aspect_ratio = img_width / max(img_height, 1)
processed_inputs = self.processor(images=pil_images, text=self.default_prompt, return_tensors="pt")
pixel_values = processed_inputs["pixel_values"]
grid_rows = None
grid_cols = None
patch_grid_rows = None
patch_grid_cols = None
if hasattr(self.processor, "image_processor") and hasattr(
self.processor.image_processor, "get_num_patches_from_image_size"
):
num_patches, grid_rows, grid_cols = self.processor.image_processor.get_num_patches_from_image_size(
img_width, img_height
)
        if pixel_values.ndim == 2 and "image_grid_thw" in processed_inputs:
            num_patches, flattened_size = pixel_values.shape
            grid_thw = processed_inputs["image_grid_thw"][0]
            temporal_frames, patch_grid_h, patch_grid_w = grid_thw.tolist()
            patch_size = getattr(self.processor.image_processor, "patch_size", 14)
            temporal_patch_size = getattr(self.processor.image_processor, "temporal_patch_size", 1)
            merge_size = getattr(self.processor.image_processor, "merge_size", 2)
            expected_size = temporal_patch_size * 3 * patch_size * patch_size
            if flattened_size == expected_size:
                pixel_values = pixel_values.reshape(num_patches, temporal_patch_size, 3, patch_size, patch_size)
                pixel_values = pixel_values[:, 0, :, :, :]
                super_grid_h = patch_grid_h // merge_size
                super_grid_w = patch_grid_w // merge_size
                pixel_values = pixel_values.reshape(
                    super_grid_h, super_grid_w, merge_size, merge_size, 3, patch_size, patch_size
                )
                pixel_values = pixel_values.permute(0, 2, 1, 3, 4, 5, 6).contiguous()
                pixel_values = pixel_values.reshape(
                    super_grid_h * merge_size, super_grid_w * merge_size, 3, patch_size, patch_size
                )
                pixel_values = pixel_values.permute(0, 3, 1, 4, 2).contiguous()
                pixel_values = pixel_values.reshape(patch_grid_h * patch_size, patch_grid_w * patch_size, 3)
                pixel_values = pixel_values.unsqueeze(0)
                patch_grid_rows = patch_grid_h
                patch_grid_cols = patch_grid_w
            else:
                raise ValueError(
                    f"Cannot reshape pixel_values: expected flattened size {expected_size} "
                    f"(temporal={temporal_patch_size} × channels=3 × patch={patch_size}×{patch_size}), "
                    f"but got {flattened_size}"
                )
        elif pixel_values.ndim == 5:
            batch_size, num_tiles, num_channels, height, width = pixel_values.shape
            pixel_values = pixel_values.view(batch_size * num_tiles, num_channels, height, width)
        elif pixel_values.ndim == 4:
            pass
        elif pixel_values.ndim == 3:
            pixel_values = pixel_values.unsqueeze(0)
        else:
            raise ValueError(f"Unexpected pixel_values shape: {pixel_values.shape}")
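        # Undo the processor's channel normalization so the tensors are viewable as ordinary images.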
        unnormalized = unnormalize(pixel_values, mean=self.channel_means, std=self.channel_stds)
        display_ready = self._prepare_images_for_display(unnormalized)
        if display_ready.shape[0] == 1:
            self._display_single_image(
                display_ready[0],
                show_patch_grid=add_grid,
                figsize=figsize,
                patch_grid_rows=patch_grid_rows,
                patch_grid_cols=patch_grid_cols,
            )
            return
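        # Some processors append a downscaled "global" view of the full image to the local
        # tiles; if present, pull it out so it is not mixed into the tile grid.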
        num_tiles = display_ready.shape[0]
        global_tile = None
        if grid_rows is not None and grid_cols is not None and grid_rows * grid_cols + 1 == num_tiles:
            global_tile = display_ready[-1]
            display_ready = display_ready[:-1]
            num_tiles = display_ready.shape[0]
            if rows is None:
                rows = grid_rows
            if cols is None:
                cols = grid_cols
        else:
            global_idx = _find_global_tile_index(display_ready, pil_images[0])
            if global_idx is not None:
                global_tile = display_ready[global_idx]
                if global_idx == 0:
                    display_ready = display_ready[1:]
                else:
                    display_ready = display_ready[:-1]
                num_tiles = display_ready.shape[0]
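        # No grid shape known: search over row counts for the (rows, cols) pair whose grid
        # aspect ratio best matches the aspect ratio of the original image.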
        if rows is None or cols is None:
            tile_h, tile_w = display_ready.shape[1:3]
            tile_aspect = tile_w / tile_h if tile_h > 0 else 1.0
            target_aspect = aspect_ratio / tile_aspect
            best_rows, best_cols = 1, num_tiles
            min_diff = float("inf")
            for r in range(1, num_tiles + 1):
                c = int(np.ceil(num_tiles / r))
                diff = abs((c / r) - target_aspect)
                if diff < min_diff:
                    min_diff = diff
                    best_rows, best_cols = r, c
            rows = best_rows
            cols = best_cols
        self._display_tiled_images(
            display_ready,
            pil_images[0],
            rows=rows,
            cols=cols,
            aspect_ratio=aspect_ratio,
            add_grid=add_grid,
            figsize=figsize,
            global_tile=global_tile,
        )

View File

@ -191,6 +191,7 @@ class TrainerIntegrationFSDP(TestCasePlus, TrainerIntegrationCommon):
         for k, v in trainer.args.fsdp_config.items():
             self.assertTrue(k in self.accelerate_fsdp_config)
             self.assertEqual(v, self.accelerate_fsdp_config[k])
+        self.assertEqual(os.environ.get("ACCELERATE_USE_FSDP", "false"), "true")

     @parameterized.expand(params, name_func=_parameterized_custom_name_func)
     def test_fsdp_config(self, sharding_strategy, dtype):
@ -211,9 +212,10 @@ class TrainerIntegrationFSDP(TestCasePlus, TrainerIntegrationCommon):
             self.assertEqual(trainer.args.fsdp[2], FSDPOption.AUTO_WRAP)
             for k, v in trainer.args.fsdp_config.items():
                 self.assertEqual(v, self.fsdp_config[k])
+            self.assertEqual(os.environ.get("ACCELERATE_USE_FSDP", "false"), "true")

     @parameterized.expand(params, name_func=_parameterized_custom_name_func)
-    def test_fsdp_plugin(self, sharding_strategy, dtype):
+    def test_fsdp_config_transformers_auto_wrap(self, sharding_strategy, dtype):
         output_dir = self.get_auto_remove_tmp_dir()
         fsdp_config = deepcopy(self.fsdp_config)
         del fsdp_config["min_num_params"]
@ -227,25 +229,27 @@ class TrainerIntegrationFSDP(TestCasePlus, TrainerIntegrationCommon):
"fsdp_config": fsdp_config,
}
kwargs[dtype] = True
prefix = "FSDP_"
with mockenv_context(**self.dist_env_1_gpu):
trainer = get_regression_trainer(**kwargs)
self.assertEqual(trainer.args.fsdp[0], sharding_strategy)
self.assertEqual(trainer.args.fsdp[1], FSDPOption.OFFLOAD)
self.assertEqual(trainer.args.fsdp[2], FSDPOption.AUTO_WRAP)
fsdp_sharding_strategy = str(FSDP_SHARDING_STRATEGY.index(sharding_strategy.upper()) + 1)
self.assertEqual(os.environ[f"{prefix}SHARDING_STRATEGY"], fsdp_sharding_strategy)
self.assertEqual(os.environ[f"{prefix}OFFLOAD_PARAMS"], "true")
self.assertEqual(os.environ[f"{prefix}AUTO_WRAP_POLICY"], "TRANSFORMER_BASED_WRAP")
self.assertEqual(
trainer.args.fsdp_plugin.sharding_strategy.value,
FSDP_SHARDING_STRATEGY.index(sharding_strategy.upper()) + 1,
os.environ[f"{prefix}TRANSFORMER_CLS_TO_WRAP"], ",".join(fsdp_config["transformer_layer_cls_to_wrap"])
)
self.assertEqual(trainer.args.fsdp_plugin.cpu_offload.offload_params, True)
self.assertEqual(os.environ[f"{prefix}BACKWARD_PREFETCH"], fsdp_config["backward_prefetch"])
self.assertEqual(os.environ[f"{prefix}FORWARD_PREFETCH"], fsdp_config["forward_prefetch"])
self.assertEqual(os.environ[f"{prefix}USE_ORIG_PARAMS"], fsdp_config["use_orig_params"])
self.assertEqual(os.environ[f"{prefix}SYNC_MODULE_STATES"], fsdp_config["sync_module_states"])
self.assertEqual(
trainer.args.fsdp_plugin.transformer_cls_names_to_wrap,
fsdp_config["transformer_layer_cls_to_wrap"],
)
self.assertEqual(trainer.args.fsdp_plugin.forward_prefetch, fsdp_config["forward_prefetch"])
self.assertEqual(trainer.args.fsdp_plugin.sync_module_states, fsdp_config["sync_module_states"])
self.assertEqual(
trainer.args.fsdp_plugin.cpu_ram_efficient_loading, fsdp_config["cpu_ram_efficient_loading"]
os.environ[f"{prefix}CPU_RAM_EFFICIENT_LOADING"], fsdp_config["cpu_ram_efficient_loading"]
)
self.assertEqual(os.environ.get("ACCELERATE_USE_FSDP", "false"), "true")
@parameterized.expand(params, name_func=_parameterized_custom_name_func)
@require_torch_multi_accelerator

View File

@ -132,7 +132,7 @@ class TestImportStructures(unittest.TestCase):
"models": {
frozenset(): {"dummy_config": {"DummyConfig"}},
"albert": {
frozenset(): {"configuration_albert": {"AlbertConfig"}},
frozenset(): {"configuration_albert": {"AlbertConfig", "AlbertOnnxConfig"}},
frozenset({"torch"}): {
"modeling_albert": {
"AlbertForMaskedLM",
@ -174,7 +174,7 @@ class TestImportStructures(unittest.TestCase):
             frozenset(): {
                 "dummy_non_model": {"DummyObject"},
                 "models.dummy_config": {"DummyConfig"},
-                "models.albert.configuration_albert": {"AlbertConfig"},
+                "models.albert.configuration_albert": {"AlbertConfig", "AlbertOnnxConfig"},
                 "models.llama.configuration_llama": {"LlamaConfig"},
                 "models.deprecated.transfo_xl.configuration_transfo_xl": {"TransfoXLConfig"},
                 "models.deprecated.transfo_xl.tokenization_transfo_xl": {"TransfoXLCorpus", "TransfoXLTokenizer"},

View File

@ -1045,6 +1045,7 @@ def ignore_undocumented(name: str) -> bool:
or name.endswith("Layer")
or name.endswith("Embeddings")
or name.endswith("Attention")
or name.endswith("OnnxConfig")
):
return True
# Submodules are not documented.

View File

@ -738,6 +738,7 @@ src/transformers/models/yoso/convert_yoso_pytorch_to_pytorch.py
 src/transformers/models/yoso/modeling_yoso.py
 src/transformers/models/zamba/configuration_zamba.py
 src/transformers/models/zamba/modeling_zamba.py
+src/transformers/onnx/config.py
 src/transformers/optimization.py
 src/transformers/pipelines/audio_classification.py
 src/transformers/pipelines/audio_utils.py