Standardize PretrainedConfig to PreTrainedConfig (#41300)

* replace

* add metaclass for full BC

* doc

* consistency

* update deprecation message

* revert
Cyril Vallez
2025-10-06 11:34:02 +02:00
committed by GitHub
parent 55b172b8eb
commit 163601c619
544 changed files with 2819 additions and 2813 deletions
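The commit message mentions adding a "metaclass for full BC" so the old `PretrainedConfig` name keeps working. A minimal sketch of how such a backward-compatible alias could look is below; the metaclass name, warning category, and message text are assumptions for illustration, not the actual transformers implementation:

```python
import warnings


class _DeprecatedNameMeta(type):
    """Hypothetical metaclass: instantiating the old class name emits a warning."""

    def __call__(cls, *args, **kwargs):
        # Warn whenever the deprecated alias is instantiated.
        warnings.warn(
            "PretrainedConfig is deprecated; use PreTrainedConfig instead.",
            FutureWarning,
        )
        return super().__call__(*args, **kwargs)

    def __instancecheck__(cls, instance):
        # isinstance(obj, PretrainedConfig) keeps working for objects of the new class.
        return isinstance(instance, PreTrainedConfig)


class PreTrainedConfig:
    """Stand-in for the renamed base class (not the real transformers class)."""

    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)


class PretrainedConfig(PreTrainedConfig, metaclass=_DeprecatedNameMeta):
    """Old name, kept as a deprecated alias for backward compatibility."""
```

With this shape, old code that instantiates or `isinstance`-checks against `PretrainedConfig` keeps working while seeing a deprecation warning.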


@@ -52,7 +52,7 @@
 <figcaption class="mt-2 text-center text-sm text-gray-500">The image shows a diagram of the Swin model's stages.</figcaption>
 </div>
-[`AutoBackbone`] lets you use pretrained models as backbones to get feature maps from different stages of the backbone. You must specify one of the following parameters in [`~PretrainedConfig.from_pretrained`]:
+[`AutoBackbone`] lets you use pretrained models as backbones to get feature maps from different stages of the backbone. You must specify one of the following parameters in [`~PreTrainedConfig.from_pretrained`]:
 * `out_indices` is the index of the layer you want to get the feature map from
 * `out_features` is the name of the layer you want to get the feature map from


@@ -54,19 +54,19 @@ DistilBertConfig {
 ```
-The attributes of a pretrained model can be modified in the [`~PretrainedConfig.from_pretrained`] function:
+The attributes of a pretrained model can be modified in the [`~PreTrainedConfig.from_pretrained`] function:
 ```py
 >>> my_config = DistilBertConfig.from_pretrained("distilbert/distilbert-base-uncased", activation="relu", attention_dropout=0.4)
 ```
-Once you are happy with your model configuration, you can save it with [`~PretrainedConfig.save_pretrained`]. Your configuration file is stored as a JSON file in the specified save directory:
+Once you are happy with your model configuration, you can save it with [`~PreTrainedConfig.save_pretrained`]. Your configuration file is stored as a JSON file in the specified save directory:
 ```py
 >>> my_config.save_pretrained(save_directory="./your_model_save_path")
 ```
-To reuse the configuration file, load it with [`~PretrainedConfig.from_pretrained`]:
+To reuse the configuration file, load it with [`~PreTrainedConfig.from_pretrained`]:
 ```py
 >>> my_config = DistilBertConfig.from_pretrained("./your_model_save_path/config.json")


@@ -20,11 +20,11 @@
 In our example, we will tweak a few arguments of the ResNet class. Different configurations will give us the different possible ResNet types. We then store those arguments after checking their validity.
 ```python
-from transformers import PretrainedConfig
+from transformers import PreTrainedConfig
 from typing import List
-class ResnetConfig(PretrainedConfig):
+class ResnetConfig(PreTrainedConfig):
     model_type = "resnet"
     def __init__(
@@ -58,11 +58,11 @@ class ResnetConfig(PretrainedConfig):
 ```
 The three important things to remember when writing your own configuration are:
-- you have to inherit from `PretrainedConfig`,
-- the `__init__` of your `PretrainedConfig` must accept any extra kwargs,
+- you have to inherit from `PreTrainedConfig`,
+- the `__init__` of your `PreTrainedConfig` must accept any extra kwargs,
 - those extra kwargs must be passed to the superclass `__init__`.
-Inheritance ensures you get all the functionality of the 🤗 Transformers library, while the second and third constraints come from the fact that `PretrainedConfig` has more fields than the ones you are setting. When reloading a config with the `from_pretrained` method, your config needs to accept those fields and then send them to the superclass.
+Inheritance ensures you get all the functionality of the 🤗 Transformers library, while the second and third constraints come from the fact that `PreTrainedConfig` has more fields than the ones you are setting. When reloading a config with the `from_pretrained` method, your config needs to accept those fields and then send them to the superclass.
 Defining a `model_type` for your configuration (here `model_type="resnet"`) is not mandatory, unless you want to
 register your model with the auto classes (see the last section).
@@ -82,7 +82,7 @@ resnet50d_config.save_pretrained("custom-resnet")
 resnet50d_config = ResnetConfig.from_pretrained("custom-resnet")
 ```
-You can also use any other method of the [`PretrainedConfig`] class, like [`~PretrainedConfig.push_to_hub`], to upload your configuration directly to the Hub.
+You can also use any other method of the [`PreTrainedConfig`] class, like [`~PreTrainedConfig.push_to_hub`], to upload your configuration directly to the Hub.
 ## Writing a custom model


@@ -53,7 +53,7 @@ So let's dive a little deeper into the general design of the library
 ### Overview of models
 To successfully add a model, it is important to understand the interaction between your model and its configuration,
-[`PreTrainedModel`] and [`PretrainedConfig`]. As an example, we will
+[`PreTrainedModel`] and [`PreTrainedConfig`]. As an example, we will
 call the model to be added to 🤗 Transformers `BrandNewBert`.
 Let's take a look:
@@ -81,10 +81,10 @@ model.config # model has access to its config
 ```
 Similar to the model, the configuration inherits basic serialization and deserialization functionality from
-[`PretrainedConfig`]. Note that the configuration and the model are always serialized into two
+[`PreTrainedConfig`]. Note that the configuration and the model are always serialized into two
 different formats - the model into a *pytorch_model.bin* file and the configuration into a *config.json* file. Calling
 [`~PreTrainedModel.save_pretrained`] will automatically call
-[`~PretrainedConfig.save_pretrained`], so that both the model and the configuration are saved.
+[`~PreTrainedConfig.save_pretrained`], so that both the model and the configuration are saved.
 ### Code style


@@ -51,7 +51,7 @@ This section describes how the model and configuration classes interact and the
 ### Model and configuration
-All Transformers' models inherit from a base [`PreTrainedModel`] and [`PretrainedConfig`] class. The configuration is the model's blueprint.
+All Transformers' models inherit from a base [`PreTrainedModel`] and [`PreTrainedConfig`] class. The configuration is the model's blueprint.
 There are never more than two levels of abstraction for any model to keep the code readable. The example model here, BrandNewLlama, inherits from `BrandNewLlamaPreTrainedModel` and [`PreTrainedModel`]. It is important that a new model only depends on [`PreTrainedModel`] so that it can use the [`~PreTrainedModel.from_pretrained`] and [`~PreTrainedModel.save_pretrained`] methods.
@@ -66,9 +66,9 @@ model = BrandNewLlamaModel.from_pretrained("username/brand_new_llama")
 model.config
 ```
-[`PretrainedConfig`] provides the [`~PretrainedConfig.from_pretrained`] and [`~PretrainedConfig.save_pretrained`] methods.
-When you use [`PreTrainedModel.save_pretrained`], it automatically calls [`PretrainedConfig.save_pretrained`] so that both the model and configuration are saved together.
+[`PreTrainedConfig`] provides the [`~PreTrainedConfig.from_pretrained`] and [`~PreTrainedConfig.save_pretrained`] methods.
+When you use [`PreTrainedModel.save_pretrained`], it automatically calls [`PreTrainedConfig.save_pretrained`] so that both the model and configuration are saved together.
 A model is saved to a `model.safetensors` file and a configuration is saved to a `config.json` file.
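The save/load coupling described in the hunk above can be illustrated with a minimal, self-contained sketch. `ToyConfig` below is a toy stand-in, not the real transformers code; it only mimics the idea that a configuration serializes its attributes to a `config.json` file and accepts extra fields on reload:

```python
import json
import os
import tempfile


class ToyConfig:
    """Minimal stand-in mimicking how a config round-trips through config.json."""

    def __init__(self, hidden_size=768, num_attention_heads=12, **kwargs):
        self.hidden_size = hidden_size
        self.num_attention_heads = num_attention_heads
        # Extra fields are accepted and stored, mirroring the kwargs rule.
        for key, value in kwargs.items():
            setattr(self, key, value)

    def save_pretrained(self, save_directory):
        # Serialize all attributes to a JSON file in the save directory.
        os.makedirs(save_directory, exist_ok=True)
        with open(os.path.join(save_directory, "config.json"), "w") as f:
            json.dump(self.__dict__, f, indent=2)

    @classmethod
    def from_pretrained(cls, save_directory):
        # Rebuild the config from the stored JSON fields.
        with open(os.path.join(save_directory, "config.json")) as f:
            return cls(**json.load(f))


with tempfile.TemporaryDirectory() as tmp:
    ToyConfig(hidden_size=64, vocab_size=1000).save_pretrained(tmp)
    reloaded = ToyConfig.from_pretrained(tmp)
    print(reloaded.hidden_size, reloaded.vocab_size)  # 64 1000
```

The real classes do far more (Hub downloads, nested configs, versioning), but the JSON round trip is the core contract.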


@@ -22,7 +22,7 @@ Higher-level computer vision tasks, such as object detection or image segmentation
 <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/Backbone.png"/>
 </div>
-Load a backbone with [`~PretrainedConfig.from_pretrained`] and use the `out_indices` parameter to determine which layer, given by the index, to extract a feature map from.
+Load a backbone with [`~PreTrainedConfig.from_pretrained`] and use the `out_indices` parameter to determine which layer, given by the index, to extract a feature map from.
 ```py
 from transformers import AutoBackbone
@@ -46,7 +46,7 @@ There are two ways to load a Transformers backbone, [`AutoBackbone`] and a model
 <hfoptions id="backbone-classes">
 <hfoption id="AutoBackbone">
-The [AutoClass](./model_doc/auto) API automatically loads a pretrained vision model with [`~PretrainedConfig.from_pretrained`] as a backbone if it's supported.
+The [AutoClass](./model_doc/auto) API automatically loads a pretrained vision model with [`~PreTrainedConfig.from_pretrained`] as a backbone if it's supported.
 Set the `out_indices` parameter to the layer you'd like to get the feature map from. If you know the name of the layer, you could also use `out_features`. These parameters can be used interchangeably, but if you use both, make sure they refer to the same layer.


@@ -25,12 +25,12 @@ This guide will show you how to customize a ResNet model, enable [AutoClass](./m
 ## Configuration
-A configuration, given by the base [`PretrainedConfig`] class, contains all the necessary information to build a model. This is where you'll configure the attributes of the custom ResNet model. Different attributes give different ResNet model types.
+A configuration, given by the base [`PreTrainedConfig`] class, contains all the necessary information to build a model. This is where you'll configure the attributes of the custom ResNet model. Different attributes give different ResNet model types.
 The main rules for customizing a configuration are:
-1. A custom configuration must subclass [`PretrainedConfig`]. This ensures a custom model has all the functionality of a Transformers model such as [`~PretrainedConfig.from_pretrained`], [`~PretrainedConfig.save_pretrained`], and [`~PretrainedConfig.push_to_hub`].
-2. The [`PretrainedConfig`] `__init__` must accept any `kwargs` and they must be passed to the superclass `__init__`. [`PretrainedConfig`] has more fields than the ones set in your custom configuration, so when you load a configuration with [`~PretrainedConfig.from_pretrained`], those fields need to be accepted by your configuration and passed to the superclass.
+1. A custom configuration must subclass [`PreTrainedConfig`]. This ensures a custom model has all the functionality of a Transformers model such as [`~PreTrainedConfig.from_pretrained`], [`~PreTrainedConfig.save_pretrained`], and [`~PreTrainedConfig.push_to_hub`].
+2. The [`PreTrainedConfig`] `__init__` must accept any `kwargs` and they must be passed to the superclass `__init__`. [`PreTrainedConfig`] has more fields than the ones set in your custom configuration, so when you load a configuration with [`~PreTrainedConfig.from_pretrained`], those fields need to be accepted by your configuration and passed to the superclass.
 > [!TIP]
 > It is useful to check the validity of some of the parameters. In the example below, a check is implemented to ensure `block_type` and `stem_type` belong to one of the predefined values.
@@ -38,10 +38,10 @@ The main rules for customizing a configuration are:
 > Add `model_type` to the configuration class to enable [AutoClass](./models#autoclass) support.
 ```py
-from transformers import PretrainedConfig
+from transformers import PreTrainedConfig
 from typing import List
-class ResnetConfig(PretrainedConfig):
+class ResnetConfig(PreTrainedConfig):
     model_type = "resnet"
     def __init__(
@@ -74,7 +74,7 @@ class ResnetConfig(PretrainedConfig):
         super().__init__(**kwargs)
 ```
-Save the configuration to a JSON file in your custom model folder, `custom-resnet`, with [`~PretrainedConfig.save_pretrained`].
+Save the configuration to a JSON file in your custom model folder, `custom-resnet`, with [`~PreTrainedConfig.save_pretrained`].
 ```py
 resnet50d_config = ResnetConfig(block_type="bottleneck", stem_width=32, stem_type="deep", avg_down=True)
@@ -83,7 +83,7 @@ resnet50d_config.save_pretrained("custom-resnet")
 ## Model
-With the custom ResNet configuration, you can now create and customize the model. The model subclasses the base [`PreTrainedModel`] class. Like [`PretrainedConfig`], inheriting from [`PreTrainedModel`] and initializing the superclass with the configuration extends Transformers' functionalities such as saving and loading to the custom model.
+With the custom ResNet configuration, you can now create and customize the model. The model subclasses the base [`PreTrainedModel`] class. Like [`PreTrainedConfig`], inheriting from [`PreTrainedModel`] and initializing the superclass with the configuration extends Transformers' functionalities such as saving and loading to the custom model.
 Transformers' models follow the convention of accepting a `config` object in the `__init__` method. This passes the entire `config` to the model sublayers, instead of breaking the `config` object into multiple arguments that are individually passed to the sublayers.
@@ -235,7 +235,7 @@ from resnet_model.configuration_resnet import ResnetConfig
 from resnet_model.modeling_resnet import ResnetModel, ResnetModelForImageClassification
 ```
-Copy the code from the model and configuration files. To make sure the AutoClass objects are saved with [`~PreTrainedModel.save_pretrained`], call the [`~PretrainedConfig.register_for_auto_class`] method. This modifies the configuration JSON file to include the AutoClass objects and mapping.
+Copy the code from the model and configuration files. To make sure the AutoClass objects are saved with [`~PreTrainedModel.save_pretrained`], call the [`~PreTrainedConfig.register_for_auto_class`] method. This modifies the configuration JSON file to include the AutoClass objects and mapping.
 For a model, pick the appropriate `AutoModelFor` class based on the task.


@@ -16,7 +16,7 @@ rendered properly in your Markdown viewer.
 # Configuration
-The base class [`PretrainedConfig`] implements the common methods for loading/saving a configuration
+The base class [`PreTrainedConfig`] implements the common methods for loading/saving a configuration
 either from a local file or directory, or from a pretrained model configuration provided by the library (downloaded
 from HuggingFace's AWS S3 repository).
@@ -24,8 +24,8 @@ Each derived config class implements model specific attributes. Common attribute
 `hidden_size`, `num_attention_heads`, and `num_hidden_layers`. Text models further implement:
 `vocab_size`.
-## PretrainedConfig
-[[autodoc]] PretrainedConfig
+## PreTrainedConfig
+[[autodoc]] PreTrainedConfig
 - push_to_hub
 - all


@@ -48,7 +48,7 @@ You will then be able to use the auto classes like you would usually do!
 <Tip warning={true}>
-If your `NewModelConfig` is a subclass of [`~transformers.PretrainedConfig`], make sure its
+If your `NewModelConfig` is a subclass of [`~transformers.PreTrainedConfig`], make sure its
 `model_type` attribute is set to the same key you use when registering the config (here `"new-model"`).
 Likewise, if your `NewModel` is a subclass of [`PreTrainedModel`], make sure its


@@ -73,7 +73,7 @@ Each pretrained model inherits from three base classes.
 | **Class** | **Description** |
 |---|---|
-| [`PretrainedConfig`] | A file that specifies a model's attributes such as the number of attention heads or vocabulary size. |
+| [`PreTrainedConfig`] | A file that specifies a model's attributes such as the number of attention heads or vocabulary size. |
 | [`PreTrainedModel`] | A model (or architecture) defined by the model attributes from the configuration file. A pretrained model only returns the raw hidden states. For a specific task, use the appropriate model head to convert the raw hidden states into a meaningful result (for example, [`LlamaModel`] versus [`LlamaForCausalLM`]). |
 | Preprocessor | A class for converting raw inputs (text, images, audio, multimodal) into numerical inputs to the model. For example, [`PreTrainedTokenizer`] converts text into tensors and [`ImageProcessingMixin`] converts pixels into tensors. |


@@ -21,7 +21,7 @@ rendered properly in your Markdown viewer.
 Transformers can export a model to TorchScript by:
 1. creating dummy inputs to create a *trace* of the model to serialize to TorchScript
-2. enabling the `torchscript` parameter in either [`~PretrainedConfig.torchscript`] for a randomly initialized model or [`~PreTrainedModel.from_pretrained`] for a pretrained model
+2. enabling the `torchscript` parameter in either [`~PreTrainedConfig.torchscript`] for a randomly initialized model or [`~PreTrainedModel.from_pretrained`] for a pretrained model
 ## Dummy inputs


@@ -135,9 +135,9 @@ class MyModel(PreTrainedModel):
 ```python
-from transformers import PretrainedConfig
-class MyConfig(PretrainedConfig):
+from transformers import PreTrainedConfig
+class MyConfig(PreTrainedConfig):
     base_model_tp_plan = {
         "layers.*.self_attn.k_proj": "colwise",
         "layers.*.self_attn.v_proj": "colwise",


@@ -83,19 +83,19 @@ DistilBertConfig {
 }
 ```
-The attributes of pretrained models can be modified with the [`~PretrainedConfig.from_pretrained`] function:
+The attributes of pretrained models can be modified with the [`~PreTrainedConfig.from_pretrained`] function:
 ```py
 >>> my_config = DistilBertConfig.from_pretrained("distilbert/distilbert-base-uncased", activation="relu", attention_dropout=0.4)
 ```
-When you are happy with your model configuration, you can save it with the [`~PretrainedConfig.save_pretrained`] function. Your configuration will be saved as a JSON file inside the directory you specify as a parameter.
+When you are happy with your model configuration, you can save it with the [`~PreTrainedConfig.save_pretrained`] function. Your configuration will be saved as a JSON file inside the directory you specify as a parameter.
 ```py
 >>> my_config.save_pretrained(save_directory="./your_model_save_path")
 ```
-To reuse the configuration file, you can load it with [`~PretrainedConfig.from_pretrained`]:
+To reuse the configuration file, you can load it with [`~PreTrainedConfig.from_pretrained`]:
 ```py
 >>> my_config = DistilBertConfig.from_pretrained("./your_model_save_path/my_config.json")


@@ -38,11 +38,11 @@ configurations will give us the different possible ResNet types. We then store
 those arguments after checking the validity of some of them.
 ```python
-from transformers import PretrainedConfig
+from transformers import PreTrainedConfig
 from typing import List
-class ResnetConfig(PretrainedConfig):
+class ResnetConfig(PreTrainedConfig):
     model_type = "resnet"
     def __init__(
@@ -76,12 +76,12 @@ class ResnetConfig(PretrainedConfig):
 ```
 The three important things to remember when writing your own configuration are the following:
-- you have to inherit from `PretrainedConfig`,
-- the `__init__` of your `PretrainedConfig` must accept any `kwargs`,
+- you have to inherit from `PreTrainedConfig`,
+- the `__init__` of your `PreTrainedConfig` must accept any `kwargs`,
 - those `kwargs` must be passed to the superclass `__init__`.
 Inheritance is there to make sure you get all the functionality of the 🤗 Transformers library, while the other two
-constraints come from the fact that a `PretrainedConfig` has more fields than the ones you are setting. When reloading a
+constraints come from the fact that a `PreTrainedConfig` has more fields than the ones you are setting. When reloading a
 `config` with the `from_pretrained` method, those fields need to be accepted by your `config` and then sent to the superclass.
 Defining a `model_type` for your configuration (here `model_type="resnet"`) is not mandatory, unless you want to
@@ -102,7 +102,7 @@ config with the `from_pretrained` method:
 resnet50d_config = ResnetConfig.from_pretrained("custom-resnet")
 ```
-You can also use any other method of the [`PretrainedConfig`] class, like [`~PretrainedConfig.push_to_hub`], to upload
+You can also use any other method of the [`PreTrainedConfig`] class, like [`~PreTrainedConfig.push_to_hub`], to upload
 your configuration directly to the Hub.
 ## Writing a custom model


@@ -71,7 +71,7 @@ For vision tasks, an image processor processes the image to format it
 <figcaption class="mt-2 text-center text-sm text-gray-500">A Swin backbone with several stages producing a feature map.</figcaption>
 </div>
-[`AutoBackbone`] lets you use pretrained models as backbones to get feature maps from different stages of the backbone. You must specify one of the following parameters in [`~PretrainedConfig.from_pretrained`]:
+[`AutoBackbone`] lets you use pretrained models as backbones to get feature maps from different stages of the backbone. You must specify one of the following parameters in [`~PreTrainedConfig.from_pretrained`]:
 * `out_indices` is the index of the layer you want the feature map from
 * `out_features` is the name of the layer you want the feature map from


@@ -67,7 +67,7 @@ With these principles in mind, let's dive into the general design of the library
 ### Overview of models
 To successfully add a model, it is important to understand the interaction between your model and its configuration,
-[`PreTrainedModel`], and [`PretrainedConfig`]. As an example, we will call the model to be added to 🤗 Transformers
+[`PreTrainedModel`], and [`PreTrainedConfig`]. As an example, we will call the model to be added to 🤗 Transformers
 `BrandNewBert`.
 Let's take a look:
@@ -94,9 +94,9 @@ model.config # the model has access to its config
 ```
 Similarly to the model, the configuration inherits basic serialization and deserialization functionality from
-[`PretrainedConfig`]. Note that the configuration and the model are always serialized into two different formats -
+[`PreTrainedConfig`]. Note that the configuration and the model are always serialized into two different formats -
 the model is serialized into a *pytorch_model.bin* file while the configuration into a *config.json* file. Calling
-[`~PreTrainedModel.save_pretrained`] will automatically call [`~PretrainedConfig.save_pretrained`], so that both the
+[`~PreTrainedModel.save_pretrained`] will automatically call [`~PreTrainedConfig.save_pretrained`], so that both the
 model and the configuration are saved.


@@ -83,19 +83,19 @@ DistilBertConfig {
 }
 ```
-The attributes of the pretrained model can be modified in the [`~PretrainedConfig.from_pretrained`] function:
+The attributes of the pretrained model can be modified in the [`~PreTrainedConfig.from_pretrained`] function:
 ```py
 >>> my_config = DistilBertConfig.from_pretrained("distilbert/distilbert-base-uncased", activation="relu", attention_dropout=0.4)
 ```
-When you are happy with the model configuration, you can save it with [`~PretrainedConfig.save_pretrained`]. Your configuration file is stored as a JSON file in the specified save directory:
+When you are happy with the model configuration, you can save it with [`~PreTrainedConfig.save_pretrained`]. Your configuration file is stored as a JSON file in the specified save directory:
 ```py
 >>> my_config.save_pretrained(save_directory="./your_model_save_path")
 ```
-To reuse the configuration file, load it with [`~PretrainedConfig.from_pretrained`]:
+To reuse the configuration file, load it with [`~PreTrainedConfig.from_pretrained`]:
 ```py
 >>> my_config = DistilBertConfig.from_pretrained("./your_model_save_path/my_config.json")


@@ -37,11 +37,11 @@ Configurazioni differenti ci daranno quindi i differenti possibili tipi di ResNe
dopo averne controllato la validità.
```python
-from transformers import PretrainedConfig
+from transformers import PreTrainedConfig
from typing import List
-class ResnetConfig(PretrainedConfig):
+class ResnetConfig(PreTrainedConfig):
    model_type = "resnet"
    def __init__(
@@ -75,12 +75,12 @@ class ResnetConfig(PretrainedConfig):
```
Le tre cose più importanti da ricordare quando scrivi le tue configurazioni sono le seguenti:
-- Devi ereditare da `Pretrainedconfig`,
+- Devi ereditare da `PreTrainedConfig`,
-- Il metodo `__init__` del tuo `Pretrainedconfig` deve accettare i kwargs,
+- Il metodo `__init__` del tuo `PreTrainedConfig` deve accettare i kwargs,
- I `kwargs` devono essere passati alla superclass `__init__`
L'eredità è importante per assicurarsi di ottenere tutte le funzionalità della libreria 🤗 transformers,
-mentre gli altri due vincoli derivano dal fatto che un `Pretrainedconfig` ha più campi di quelli che stai settando.
+mentre gli altri due vincoli derivano dal fatto che un `PreTrainedConfig` ha più campi di quelli che stai settando.
Quando ricarichi una config da un metodo `from_pretrained`, questi campi devono essere accettati dalla tua config e
poi inviati alla superclasse.
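The three rules stated above (inherit from the base config, accept arbitrary kwargs in `__init__`, forward them to the superclass) can be sketched without the library. `BaseConfig` below is a hypothetical stand-in for `PreTrainedConfig`, not the real class:

```python
# Minimal sketch of the configuration contract described above.
# BaseConfig is a hypothetical stand-in for transformers.PreTrainedConfig.
class BaseConfig:
    def __init__(self, **kwargs):
        # The real base class manages many shared fields (num_labels, ...);
        # here we simply store whatever arrives.
        for key, value in kwargs.items():
            setattr(self, key, value)


class ResnetConfig(BaseConfig):
    model_type = "resnet"

    def __init__(self, block_type="bottleneck", layers=None, num_classes=1000, **kwargs):
        # Validate model-specific arguments before storing them.
        if block_type not in ("basic", "bottleneck"):
            raise ValueError(f"`block_type` must be 'basic' or 'bottleneck', got {block_type!r}.")
        self.block_type = block_type
        self.layers = layers if layers is not None else [3, 4, 6, 3]
        self.num_classes = num_classes
        super().__init__(**kwargs)  # rule 3: unknown kwargs go to the base class


config = ResnetConfig(block_type="bottleneck", torch_dtype="float16")
print(config.layers, config.torch_dtype)  # [3, 4, 6, 3] float16
```

Because unknown kwargs are forwarded rather than rejected, extra fields written by the base class (and later reloaded by `from_pretrained`) round-trip cleanly through the subclass.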
@@ -102,7 +102,7 @@ config con il metodo `from_pretrained`.
resnet50d_config = ResnetConfig.from_pretrained("custom-resnet")
```
-Puoi anche usare qualunque altro metodo della classe [`PretrainedConfig`], come [`~PretrainedConfig.push_to_hub`]
+Puoi anche usare qualunque altro metodo della classe [`PreTrainedConfig`], come [`~PreTrainedConfig.push_to_hub`]
per caricare direttamente la tua configurazione nell'hub.
## Scrivere un modello personalizzato

View File

@@ -51,7 +51,7 @@ Hugging Faceチームのメンバーがサポートを提供するので、一
### Overview of models
-モデルを正常に追加するためには、モデルとその設定、[`PreTrainedModel`]、および[`PretrainedConfig`]の相互作用を理解することが重要です。
+モデルを正常に追加するためには、モデルとその設定、[`PreTrainedModel`]、および[`PreTrainedConfig`]の相互作用を理解することが重要です。
例示的な目的で、🤗 Transformersに追加するモデルを「BrandNewBert」と呼びます。
以下をご覧ください:
@@ -77,7 +77,7 @@ model = BrandNewBertModel.from_pretrained("brandy/brand_new_bert")
model.config  # model has access to its config
```
-モデルと同様に、設定は[`PretrainedConfig`]から基本的なシリアル化および逆シリアル化の機能を継承しています。注意すべきは、設定とモデルは常に2つの異なる形式にシリアル化されることです - モデルは*pytorch_model.bin*ファイルに、設定は*config.json*ファイルにシリアル化されます。[`~PreTrainedModel.save_pretrained`]を呼び出すと、自動的に[`~PretrainedConfig.save_pretrained`]も呼び出され、モデルと設定の両方が保存されます。
+モデルと同様に、設定は[`PreTrainedConfig`]から基本的なシリアル化および逆シリアル化の機能を継承しています。注意すべきは、設定とモデルは常に2つの異なる形式にシリアル化されることです - モデルは*pytorch_model.bin*ファイルに、設定は*config.json*ファイルにシリアル化されます。[`~PreTrainedModel.save_pretrained`]を呼び出すと、自動的に[`~PreTrainedConfig.save_pretrained`]も呼び出され、モデルと設定の両方が保存されます。
### Code style

View File

@@ -86,19 +86,19 @@ DistilBertConfig {
}
```
-事前学習済みモデルの属性は、[`~PretrainedConfig.from_pretrained`] 関数で変更できます:
+事前学習済みモデルの属性は、[`~PreTrainedConfig.from_pretrained`] 関数で変更できます:
```py
>>> my_config = DistilBertConfig.from_pretrained("distilbert/distilbert-base-uncased", activation="relu", attention_dropout=0.4)
```
-Once you are satisfied with your model configuration, you can save it with [`PretrainedConfig.save_pretrained`]. Your configuration file is stored as a JSON file in the specified save directory.
+Once you are satisfied with your model configuration, you can save it with [`PreTrainedConfig.save_pretrained`]. Your configuration file is stored as a JSON file in the specified save directory.
```py
>>> my_config.save_pretrained(save_directory="./your_model_save_path")
```
-設定ファイルを再利用するには、[`~PretrainedConfig.from_pretrained`]を使用してそれをロードします:
+設定ファイルを再利用するには、[`~PreTrainedConfig.from_pretrained`]を使用してそれをロードします:
```py
>>> my_config = DistilBertConfig.from_pretrained("./your_model_save_path/config.json")

View File

@@ -29,11 +29,11 @@ rendered properly in your Markdown viewer.
この例では、ResNetクラスのいくつかの引数を取得し、調整したいかもしれないとします。異なる設定は、異なるタイプのResNetを提供します。その後、これらの引数を確認した後、それらの引数を単に格納します。
```python
-from transformers import PretrainedConfig
+from transformers import PreTrainedConfig
from typing import List
-class ResnetConfig(PretrainedConfig):
+class ResnetConfig(PreTrainedConfig):
    model_type = "resnet"
    def __init__(
@@ -67,12 +67,12 @@ class ResnetConfig(PretrainedConfig):
```
重要なことを3つ覚えておくべきポイントは次のとおりです:
-- `PretrainedConfig` を継承する必要があります。
+- `PreTrainedConfig` を継承する必要があります。
-- あなたの `PretrainedConfig` の `__init__` は任意の kwargs を受け入れる必要があります。
+- あなたの `PreTrainedConfig` の `__init__` は任意の kwargs を受け入れる必要があります。
- これらの `kwargs` は親クラスの `__init__` に渡す必要があります。
継承は、🤗 Transformers ライブラリのすべての機能を取得できるようにするためです。他の2つの制約は、
-`PretrainedConfig` が設定しているフィールド以外にも多くのフィールドを持っていることから来ています。
+`PreTrainedConfig` が設定しているフィールド以外にも多くのフィールドを持っていることから来ています。
`from_pretrained` メソッドで設定を再ロードする場合、これらのフィールドはあなたの設定に受け入れられ、
その後、親クラスに送信される必要があります。
@@ -95,7 +95,7 @@ resnet50d_config.save_pretrained("custom-resnet")
resnet50d_config = ResnetConfig.from_pretrained("custom-resnet")
```
-また、[`PretrainedConfig`] クラスの他のメソッドを使用することもできます。たとえば、[`~PretrainedConfig.push_to_hub`] を使用して、設定を直接 Hub にアップロードできます。
+また、[`PreTrainedConfig`] クラスの他のメソッドを使用することもできます。たとえば、[`~PreTrainedConfig.push_to_hub`] を使用して、設定を直接 Hub にアップロードできます。
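The `save_pretrained` / `from_pretrained` round trip these docs keep referring to is, at its core, a JSON round trip. A library-free sketch (`ToyConfig` is a hypothetical simplification; the real `PreTrainedConfig` handles many more fields and Hub downloads):

```python
import json
import os
import tempfile


# Library-free sketch of the save_pretrained / from_pretrained round trip;
# ToyConfig is a hypothetical simplification of PreTrainedConfig.
class ToyConfig:
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

    def save_pretrained(self, save_directory):
        # Configurations are stored as a config.json inside the directory.
        os.makedirs(save_directory, exist_ok=True)
        with open(os.path.join(save_directory, "config.json"), "w") as f:
            json.dump(self.__dict__, f, indent=2)

    @classmethod
    def from_pretrained(cls, path):
        # Accept either a directory or a direct path to a JSON file.
        if os.path.isdir(path):
            path = os.path.join(path, "config.json")
        with open(path) as f:
            return cls(**json.load(f))


with tempfile.TemporaryDirectory() as d:
    ToyConfig(activation="relu", attention_dropout=0.4).save_pretrained(d)
    reloaded = ToyConfig.from_pretrained(d)

print(reloaded.activation)  # relu
```

This also shows why the docs can load either a directory or a `config.json` path interchangeably: both resolve to the same JSON file.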
## Writing a custom model ## Writing a custom model

View File

@@ -16,7 +16,7 @@ rendered properly in your Markdown viewer.
構成
-基本クラス [`PretrainedConfig`] は、設定をロード/保存するための一般的なメソッドを実装します。
+基本クラス [`PreTrainedConfig`] は、設定をロード/保存するための一般的なメソッドを実装します。
ローカル ファイルまたはディレクトリから、またはライブラリ (ダウンロードされた) によって提供される事前トレーニング済みモデル構成から
HuggingFace の AWS S3 リポジトリから)。
@@ -24,8 +24,8 @@ HuggingFace の AWS S3 リポジトリから)。
`hidden_size`、`num_attention_heads`、および `num_hidden_layers`。テキスト モデルはさらに以下を実装します。
`vocab_size`
-## PretrainedConfig
+## PreTrainedConfig
-[[autodoc]] PretrainedConfig
+[[autodoc]] PreTrainedConfig
    - push_to_hub
    - all

View File

@@ -43,7 +43,7 @@ AutoModel.register(NewModelConfig, NewModel)
<Tip warning={true}>
-あなたの`NewModelConfig`が[`~transformers.PretrainedConfig`]のサブクラスである場合、その`model_type`属性がコンフィグを登録するときに使用するキー(ここでは`"new-model"`)と同じに設定されていることを確認してください。
+あなたの`NewModelConfig`が[`~transformers.PreTrainedConfig`]のサブクラスである場合、その`model_type`属性がコンフィグを登録するときに使用するキー(ここでは`"new-model"`)と同じに設定されていることを確認してください。
同様に、あなたの`NewModel`が[`PreTrainedModel`]のサブクラスである場合、その`config_class`属性がモデルを登録する際に使用するクラス(ここでは`NewModelConfig`)と同じに設定されていることを確認してください。
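The consistency rule in this tip (the registration key must match the config's `model_type`) can be sketched as a tiny registry. This is a hypothetical simplification of what the auto-class registries check, not the library's actual code:

```python
# Hypothetical sketch of the consistency check described in the tip above;
# the real AutoConfig/AutoModel registries do more, but the idea is the same.
CONFIG_REGISTRY = {}


class BaseConfig:
    model_type = ""


def register_config(key, config_class):
    # A config registered under `key` must declare the same `model_type`,
    # otherwise later lookups by model_type would miss it.
    if issubclass(config_class, BaseConfig) and config_class.model_type != key:
        raise ValueError(
            f"model_type {config_class.model_type!r} does not match the registration key {key!r}"
        )
    CONFIG_REGISTRY[key] = config_class


class NewModelConfig(BaseConfig):
    model_type = "new-model"


register_config("new-model", NewModelConfig)
print("new-model" in CONFIG_REGISTRY)  # True
```

Registering the same class under a mismatched key (say `"other-model"`) raises immediately, which is exactly the failure mode the tip warns about.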

View File

@@ -46,7 +46,7 @@ Hugging Face 팀은 항상 도움을 줄 준비가 되어 있으므로 혼자가
### 모델 개요 [[overview-of-models]]
-모델을 성공적으로 추가하려면 모델과 해당 구성인 [`PreTrainedModel`] 및 [`PretrainedConfig`] 간의 상호작용을 이해하는 것이 중요합니다. 예를 들어, 🤗 Transformers에 추가하려는 모델을 `BrandNewBert`라고 부르겠습니다.
+모델을 성공적으로 추가하려면 모델과 해당 구성인 [`PreTrainedModel`] 및 [`PreTrainedConfig`] 간의 상호작용을 이해하는 것이 중요합니다. 예를 들어, 🤗 Transformers에 추가하려는 모델을 `BrandNewBert`라고 부르겠습니다.
다음을 살펴보겠습니다:
@@ -59,7 +59,7 @@ model = BrandNewBertModel.from_pretrained("brandy/brand_new_bert")
model.config  # model has access to its config
```
-모델과 마찬가지로 구성은 [`PretrainedConfig`]에서 기본 직렬화 및 역직렬화 기능을 상속받습니다. 구성과 모델은 항상 *pytorch_model.bin* 파일과 *config.json* 파일로 각각 별도로 직렬화됩니다. [`~PreTrainedModel.save_pretrained`]를 호출하면 자동으로 [`~PretrainedConfig.save_pretrained`]도 호출되므로 모델과 구성이 모두 저장됩니다.
+모델과 마찬가지로 구성은 [`PreTrainedConfig`]에서 기본 직렬화 및 역직렬화 기능을 상속받습니다. 구성과 모델은 항상 *pytorch_model.bin* 파일과 *config.json* 파일로 각각 별도로 직렬화됩니다. [`~PreTrainedModel.save_pretrained`]를 호출하면 자동으로 [`~PreTrainedConfig.save_pretrained`]도 호출되므로 모델과 구성이 모두 저장됩니다.
### 코드 스타일 [[code-style]]

View File

@@ -36,11 +36,11 @@ rendered properly in your Markdown viewer.
그런 다음 몇 가지 유효성을 확인한 후 해당 인수를 저장합니다.
```python
-from transformers import PretrainedConfig
+from transformers import PreTrainedConfig
from typing import List
-class ResnetConfig(PretrainedConfig):
+class ResnetConfig(PreTrainedConfig):
    model_type = "resnet"
    def __init__(
@@ -74,12 +74,12 @@ class ResnetConfig(PretrainedConfig):
```
사용자 정의 `configuration`을 작성할 때 기억해야 할 세 가지 중요한 사항은 다음과 같습니다:
-- `PretrainedConfig`을 상속해야 합니다.
+- `PreTrainedConfig`을 상속해야 합니다.
-- `PretrainedConfig`의 `__init__`은 모든 kwargs를 허용해야 하고,
+- `PreTrainedConfig`의 `__init__`은 모든 kwargs를 허용해야 하고,
- 이러한 `kwargs`는 상위 클래스 `__init__`에 전달되어야 합니다.
상속은 🤗 Transformers 라이브러리에서 모든 기능을 가져오는 것입니다.
-이러한 점으로부터 비롯되는 두 가지 제약 조건은 `PretrainedConfig`에 설정하는 것보다 더 많은 필드가 있습니다.
+이러한 점으로부터 비롯되는 두 가지 제약 조건은 `PreTrainedConfig`에 설정하는 것보다 더 많은 필드가 있습니다.
`from_pretrained` 메서드로 구성을 다시 로드할 때 해당 필드는 구성에서 수락한 후 상위 클래스로 보내야 합니다.
모델을 auto 클래스에 등록하지 않는 한, `configuration`에서 `model_type`을 정의(여기서 `model_type="resnet"`)하는 것은 필수 사항이 아닙니다 (마지막 섹션 참조).
@@ -99,7 +99,7 @@ resnet50d_config.save_pretrained("custom-resnet")
resnet50d_config = ResnetConfig.from_pretrained("custom-resnet")
```
-구성을 Hub에 직접 업로드하기 위해 [`PretrainedConfig`] 클래스의 [`~PretrainedConfig.push_to_hub`]와 같은 다른 메서드를 사용할 수 있습니다.
+구성을 Hub에 직접 업로드하기 위해 [`PreTrainedConfig`] 클래스의 [`~PreTrainedConfig.push_to_hub`]와 같은 다른 메서드를 사용할 수 있습니다.
## 사용자 정의 모델 작성하기[[writing-a-custom-model]]

View File

@@ -16,13 +16,13 @@ rendered properly in your Markdown viewer.
# 구성[[configuration]]
-기본 클래스 [`PretrainedConfig`]는 로컬 파일이나 디렉토리, 또는 라이브러리에서 제공하는 사전 학습된 모델 구성(HuggingFace의 AWS S3 저장소에서 다운로드됨)으로부터 구성을 불러오거나 저장하는 공통 메서드를 구현합니다. 각 파생 구성 클래스는 모델별 특성을 구현합니다.
+기본 클래스 [`PreTrainedConfig`]는 로컬 파일이나 디렉토리, 또는 라이브러리에서 제공하는 사전 학습된 모델 구성(HuggingFace의 AWS S3 저장소에서 다운로드됨)으로부터 구성을 불러오거나 저장하는 공통 메서드를 구현합니다. 각 파생 구성 클래스는 모델별 특성을 구현합니다.
모든 구성 클래스에 존재하는 공통 속성은 다음과 같습니다: `hidden_size`, `num_attention_heads`, `num_hidden_layers`. 텍스트 모델은 추가로 `vocab_size`를 구현합니다.
-## PretrainedConfig[[transformers.PretrainedConfig]]
+## PreTrainedConfig[[transformers.PreTrainedConfig]]
-[[autodoc]] PretrainedConfig
+[[autodoc]] PreTrainedConfig
    - push_to_hub
    - all

View File

@@ -44,7 +44,7 @@ AutoModel.register(NewModelConfig, NewModel)
<Tip warning={true}>
-만약 `NewModelConfig`가 [`~transformers.PretrainedConfig`]의 서브클래스라면, 해당 `model_type` 속성이 등록할 때 사용하는 키(여기서는 `"new-model"`)와 동일하게 설정되어 있는지 확인하세요.
+만약 `NewModelConfig`가 [`~transformers.PreTrainedConfig`]의 서브클래스라면, 해당 `model_type` 속성이 등록할 때 사용하는 키(여기서는 `"new-model"`)와 동일하게 설정되어 있는지 확인하세요.
마찬가지로, `NewModel`이 [`PreTrainedModel`]의 서브클래스라면, 해당 `config_class` 속성이 등록할 때 사용하는 클래스(여기서는 `NewModelConfig`)와 동일하게 설정되어 있는지 확인하세요.

View File

@@ -83,19 +83,19 @@ DistilBertConfig {
}
```
-Atributos de um modelo pré-treinado podem ser modificados na função [`~PretrainedConfig.from_pretrained`]:
+Atributos de um modelo pré-treinado podem ser modificados na função [`~PreTrainedConfig.from_pretrained`]:
```py
>>> my_config = DistilBertConfig.from_pretrained("distilbert/distilbert-base-uncased", activation="relu", attention_dropout=0.4)
```
-Uma vez que você está satisfeito com as configurações do seu modelo, você consegue salvar elas com [`~PretrainedConfig.save_pretrained`]. Seu arquivo de configurações está salvo como um arquivo JSON no diretório especificado:
+Uma vez que você está satisfeito com as configurações do seu modelo, você consegue salvar elas com [`~PreTrainedConfig.save_pretrained`]. Seu arquivo de configurações está salvo como um arquivo JSON no diretório especificado:
```py
>>> my_config.save_pretrained(save_directory="./your_model_save_path")
```
-Para reusar o arquivo de configurações, carregue com [`~PretrainedConfig.from_pretrained`]:
+Para reusar o arquivo de configurações, carregue com [`~PreTrainedConfig.from_pretrained`]:
```py
>>> my_config = DistilBertConfig.from_pretrained("./your_model_save_path/my_config.json")

View File

@@ -37,11 +37,11 @@ configurações nos dará os diferentes tipos de ResNets que são possíveis. Em
após verificar a validade de alguns deles.
```python
-from transformers import PretrainedConfig
+from transformers import PreTrainedConfig
from typing import List
-class ResnetConfig(PretrainedConfig):
+class ResnetConfig(PreTrainedConfig):
    model_type = "resnet"
    def __init__(
@@ -75,12 +75,12 @@ class ResnetConfig(PretrainedConfig):
```
As três coisas importantes a serem lembradas ao escrever sua própria configuração são:
-- você tem que herdar de `PretrainedConfig`,
+- você tem que herdar de `PreTrainedConfig`,
-- o `__init__` do seu `PretrainedConfig` deve aceitar quaisquer kwargs,
+- o `__init__` do seu `PreTrainedConfig` deve aceitar quaisquer kwargs,
- esses `kwargs` precisam ser passados para a superclasse `__init__`.
A herança é para garantir que você obtenha todas as funcionalidades da biblioteca 🤗 Transformers, enquanto as outras duas
-restrições vêm do fato de um `PretrainedConfig` ter mais campos do que os que você está configurando. Ao recarregar um
+restrições vêm do fato de um `PreTrainedConfig` ter mais campos do que os que você está configurando. Ao recarregar um
config com o método `from_pretrained`, esses campos precisam ser aceitos pelo seu config e então enviados para a
superclasse.
@@ -102,7 +102,7 @@ método `from_pretrained`:
resnet50d_config = ResnetConfig.from_pretrained("custom-resnet")
```
-Você também pode usar qualquer outro método da classe [`PretrainedConfig`], como [`~PretrainedConfig.push_to_hub`] para
+Você também pode usar qualquer outro método da classe [`PreTrainedConfig`], como [`~PreTrainedConfig.push_to_hub`] para
carregar diretamente sua configuração para o Hub.
## Escrevendo um modelo customizado

View File

@@ -84,19 +84,19 @@ DistilBertConfig {
}
```
-预训练模型的属性可以在 [`~PretrainedConfig.from_pretrained`] 函数中进行修改:
+预训练模型的属性可以在 [`~PreTrainedConfig.from_pretrained`] 函数中进行修改:
```py
>>> my_config = DistilBertConfig.from_pretrained("distilbert/distilbert-base-uncased", activation="relu", attention_dropout=0.4)
```
-当你对模型配置满意时,可以使用 [`~PretrainedConfig.save_pretrained`] 来保存配置。你的配置文件将以 JSON 文件的形式存储在指定的保存目录中:
+当你对模型配置满意时,可以使用 [`~PreTrainedConfig.save_pretrained`] 来保存配置。你的配置文件将以 JSON 文件的形式存储在指定的保存目录中:
```py
>>> my_config.save_pretrained(save_directory="./your_model_save_path")
```
-要重用配置文件,请使用 [`~PretrainedConfig.from_pretrained`] 进行加载:
+要重用配置文件,请使用 [`~PreTrainedConfig.from_pretrained`] 进行加载:
```py
>>> my_config = DistilBertConfig.from_pretrained("./your_model_save_path/config.json")

View File

@@ -29,11 +29,11 @@ rendered properly in your Markdown viewer.
我们将采用一些我们可能想要调整的 ResNet 类的参数举例。不同的配置将为我们提供不同类型可能的 ResNet 模型。在确认其中一些参数的有效性后,我们只需存储这些参数。
```python
-from transformers import PretrainedConfig
+from transformers import PreTrainedConfig
from typing import List
-class ResnetConfig(PretrainedConfig):
+class ResnetConfig(PreTrainedConfig):
    model_type = "resnet"
    def __init__(
@@ -67,11 +67,11 @@ class ResnetConfig(PretrainedConfig):
```
编写自定义配置时需要记住的三个重要事项如下:
-- 必须继承自 `PretrainedConfig`
+- 必须继承自 `PreTrainedConfig`
-- `PretrainedConfig` 的 `__init__` 方法必须接受任何 kwargs
+- `PreTrainedConfig` 的 `__init__` 方法必须接受任何 kwargs
- 这些 `kwargs` 需要传递给超类的 `__init__` 方法。
-继承是为了确保你获得来自 🤗 Transformers 库的所有功能,而另外两个约束源于 `PretrainedConfig` 的字段比你设置的字段多。在使用 `from_pretrained` 方法重新加载配置时,这些字段需要被你的配置接受,然后传递给超类。
+继承是为了确保你获得来自 🤗 Transformers 库的所有功能,而另外两个约束源于 `PreTrainedConfig` 的字段比你设置的字段多。在使用 `from_pretrained` 方法重新加载配置时,这些字段需要被你的配置接受,然后传递给超类。
为你的配置定义 `model_type`(此处为 `model_type="resnet"`)不是必须的,除非你想使用自动类注册你的模型(请参阅最后一节)。
@@ -88,7 +88,7 @@ resnet50d_config.save_pretrained("custom-resnet")
resnet50d_config = ResnetConfig.from_pretrained("custom-resnet")
```
-你还可以使用 [`PretrainedConfig`] 类的任何其他方法,例如 [`~PretrainedConfig.push_to_hub`],直接将配置上传到 Hub。
+你还可以使用 [`PreTrainedConfig`] 类的任何其他方法,例如 [`~PreTrainedConfig.push_to_hub`],直接将配置上传到 Hub。
## 编写自定义模型

View File

@@ -16,13 +16,13 @@ rendered properly in your Markdown viewer.
# Configuration
-基类[`PretrainedConfig`]实现了从本地文件或目录加载/保存配置的常见方法(或下载库提供的预训练模型配置(从HuggingFace的AWS S3库中下载)
+基类[`PreTrainedConfig`]实现了从本地文件或目录加载/保存配置的常见方法(或下载库提供的预训练模型配置(从HuggingFace的AWS S3库中下载)
每个派生的配置类都实现了特定于模型的属性。所有配置类中共同存在的属性有:`hidden_size`、`num_attention_heads`、`num_hidden_layers`。文本模型进一步添加了 `vocab_size`。
-## PretrainedConfig
+## PreTrainedConfig
-[[autodoc]] PretrainedConfig
+[[autodoc]] PreTrainedConfig
    - push_to_hub
    - all

View File

@@ -17,7 +17,7 @@ from transformers import (
    AutoModelForTokenClassification,
    AutoModelWithLMHead,
    AutoTokenizer,
-    PretrainedConfig,
+    PreTrainedConfig,
    PreTrainedTokenizer,
    is_torch_available,
)
@@ -93,7 +93,7 @@ class BaseTransformer(pl.LightningModule):
                **config_kwargs,
            )
        else:
-            self.config: PretrainedConfig = config
+            self.config: PreTrainedConfig = config
        extra_model_params = ("encoder_layerdrop", "decoder_layerdrop", "dropout", "attention_dropout")
        for p in extra_model_params:

View File

@@ -5,19 +5,19 @@
# modular_duplicated_method.py file directly. One of our CI enforces this.
# 🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
from ...modeling_rope_utils import rope_config_validation
-class DuplicatedMethodConfig(PretrainedConfig):
+class DuplicatedMethodConfig(PreTrainedConfig):
    r"""
    This is the configuration class to store the configuration of a [`DuplicatedMethodModel`]. It is used to instantiate an DuplicatedMethod
    model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
    defaults will yield a similar configuration to that of the DuplicatedMethod-7B.
    e.g. [meta-duplicated_method/DuplicatedMethod-2-7b-hf](https://huggingface.co/meta-duplicated_method/DuplicatedMethod-2-7b-hf)
-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.
    Args:

View File

@@ -5,19 +5,19 @@
# modular_my_new_model.py file directly. One of our CI enforces this.
# 🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
from ...modeling_rope_utils import rope_config_validation
-class MyNewModelConfig(PretrainedConfig):
+class MyNewModelConfig(PreTrainedConfig):
    r"""
    This is the configuration class to store the configuration of a [`MyNewModelModel`]. It is used to instantiate an MyNewModel
    model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
    defaults will yield a similar configuration to that of the MyNewModel-7B.
    e.g. [meta-my_new_model/MyNewModel-2-7b-hf](https://huggingface.co/meta-my_new_model/MyNewModel-2-7b-hf)
-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.
    Args:

View File

@@ -5,18 +5,18 @@
# modular_my_new_model2.py file directly. One of our CI enforces this.
# 🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
from ...modeling_rope_utils import rope_config_validation
-class MyNewModel2Config(PretrainedConfig):
+class MyNewModel2Config(PreTrainedConfig):
    r"""
    This is the configuration class to store the configuration of a [`GemmaModel`]. It is used to instantiate an Gemma
    model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
    defaults will yield a similar configuration to that of the Gemma-7B.
    e.g. [google/gemma-7b](https://huggingface.co/google/gemma-7b)
-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.
    Args:
        vocab_size (`int`, *optional*, defaults to 256000):
            Vocabulary size of the Gemma model. Defines the number of different tokens that can be represented by the

View File

@@ -6,17 +6,17 @@
# 🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨
# Example where we only want to overwrite the defaults of an init
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
-class NewModelConfig(PretrainedConfig):
+class NewModelConfig(PreTrainedConfig):
    r"""
    This is the configuration class to store the configuration of a [`NewModelModel`]. It is used to instantiate an NewModel
    model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
    defaults will yield a similar configuration to that of the NewModel-7B.
    e.g. [google/new_model-7b](https://huggingface.co/google/new_model-7b)
-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.
    Args:
        vocab_size (`int`, *optional*, defaults to 256000):
            Vocabulary size of the NewModel model. Defines the number of different tokens that can be represented by the

View File

@@ -9,8 +9,8 @@ class MyNewModelConfig(LlamaConfig):
    defaults will yield a similar configuration to that of the MyNewModel-7B.
    e.g. [meta-my_new_model/MyNewModel-2-7b-hf](https://huggingface.co/meta-my_new_model/MyNewModel-2-7b-hf)
-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.
    Args:

View File

@@ -9,8 +9,8 @@ class MyNewModel2Config(LlamaConfig):
    model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
    defaults will yield a similar configuration to that of the Gemma-7B.
    e.g. [google/gemma-7b](https://huggingface.co/google/gemma-7b)
-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.
    Args:
        vocab_size (`int`, *optional*, defaults to 256000):
            Vocabulary size of the Gemma model. Defines the number of different tokens that can be represented by the

View File

@@ -51,7 +51,7 @@ from transformers import (
    DataCollatorWithPadding,
    EvalPrediction,
    HfArgumentParser,
-    PretrainedConfig,
+    PreTrainedConfig,
    Trainer,
    TrainingArguments,
    default_data_collator,
@@ -429,7 +429,7 @@ def main():
    # Some models have set the order of the labels to use, so let's make sure we do use it.
    label_to_id = None
    if (
-        model.config.label2id != PretrainedConfig(num_labels=num_labels).label2id
+        model.config.label2id != PreTrainedConfig(num_labels=num_labels).label2id
        and data_args.task_name is not None
        and not is_regression
    ):
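The comparison in this hunk works because a config built from only `num_labels` gets placeholder label names (`LABEL_0`, `LABEL_1`, ...); anything else means the checkpoint author set a real label order worth reusing. A library-free sketch of that default:

```python
# Sketch of the default label mapping that the comparison above relies on:
# a config built with only `num_labels` gets placeholder names LABEL_0, LABEL_1, ...
def default_label2id(num_labels):
    return {f"LABEL_{i}": i for i in range(num_labels)}


# A checkpoint whose author set real class names differs from the placeholder,
# which is how the script detects a custom label order it should preserve.
model_label2id = {"negative": 0, "positive": 1}
print(model_label2id != default_label2id(2))  # True
```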

View File

@@ -53,7 +53,7 @@ from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorWithPadding,
-    PretrainedConfig,
+    PreTrainedConfig,
    SchedulerType,
    default_data_collator,
    get_scheduler,
@@ -367,7 +367,7 @@ def main():
    # Some models have set the order of the labels to use, so let's make sure we do use it.
    label_to_id = None
    if (
-        model.config.label2id != PretrainedConfig(num_labels=num_labels).label2id
+        model.config.label2id != PreTrainedConfig(num_labels=num_labels).label2id
        and args.task_name is not None
        and not is_regression
    ):

View File

@ -48,7 +48,7 @@ from transformers import (
AutoTokenizer, AutoTokenizer,
DataCollatorForTokenClassification, DataCollatorForTokenClassification,
HfArgumentParser, HfArgumentParser,
PretrainedConfig, PreTrainedConfig,
PreTrainedTokenizerFast, PreTrainedTokenizerFast,
Trainer, Trainer,
TrainingArguments, TrainingArguments,
@@ -413,7 +413,7 @@ def main():
     )
     # Model has labels -> use them.
-    if model.config.label2id != PretrainedConfig(num_labels=num_labels).label2id:
+    if model.config.label2id != PreTrainedConfig(num_labels=num_labels).label2id:
         if sorted(model.config.label2id.keys()) == sorted(label_list):
             # Reorganize `label_list` to match the ordering of the model.
             if labels_are_int:
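When the guarded branch above fires and the checkpoint's label names match the dataset's, the scripts reorder `label_list` to follow the model's ids. A self-contained sketch of that reordering with hypothetical NER labels:

```python
# Hypothetical checkpoint mapping and dataset label list.
model_label2id = {"B-PER": 1, "O": 0, "B-LOC": 2}
label_list = ["B-LOC", "B-PER", "O"]

if sorted(model_label2id.keys()) == sorted(label_list):
    # Reorganize `label_list` to match the ordering of the model.
    label_list = sorted(label_list, key=lambda name: model_label2id[name])

assert label_list == ["O", "B-PER", "B-LOC"]
```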


@@ -57,7 +57,7 @@ from transformers import (
     AutoModelForTokenClassification,
     AutoTokenizer,
     DataCollatorForTokenClassification,
-    PretrainedConfig,
+    PreTrainedConfig,
     SchedulerType,
     default_data_collator,
     get_scheduler,
@@ -454,7 +454,7 @@ def main():
     model.resize_token_embeddings(len(tokenizer))
     # Model has labels -> use them.
-    if model.config.label2id != PretrainedConfig(num_labels=num_labels).label2id:
+    if model.config.label2id != PreTrainedConfig(num_labels=num_labels).label2id:
         if sorted(model.config.label2id.keys()) == sorted(label_list):
             # Reorganize `label_list` to match the ordering of the model.
             if labels_are_int:


@@ -59,7 +59,7 @@ logger = logging.get_logger(__name__)  # pylint: disable=invalid-name
 _import_structure = {
     "audio_utils": [],
     "commands": [],
-    "configuration_utils": ["PretrainedConfig"],
+    "configuration_utils": ["PreTrainedConfig", "PretrainedConfig"],
     "convert_slow_tokenizers_checkpoints_to_fast": [],
     "data": [
         "DataProcessor",
@@ -491,6 +491,7 @@ if TYPE_CHECKING:
     from .cache_utils import StaticCache as StaticCache
     from .cache_utils import StaticLayer as StaticLayer
     from .cache_utils import StaticSlidingWindowLayer as StaticSlidingWindowLayer
+    from .configuration_utils import PreTrainedConfig as PreTrainedConfig
     from .configuration_utils import PretrainedConfig as PretrainedConfig
     from .convert_slow_tokenizer import SLOW_TO_FAST_CONVERTERS as SLOW_TO_FAST_CONVERTERS
     from .convert_slow_tokenizer import convert_slow_tokenizer as convert_slow_tokenizer


@@ -4,7 +4,7 @@ from typing import Any, Optional
 import torch
-from .configuration_utils import PretrainedConfig
+from .configuration_utils import PreTrainedConfig
 from .utils import (
     is_hqq_available,
     is_quanto_greater,
@@ -923,7 +923,7 @@ class DynamicCache(Cache):
             `map(gather_map, zip(*caches))`, i.e. each item in the iterable contains the key and value states
             for a layer gathered across replicas by torch.distributed (shape=[global batch size, num_heads, seq_len, head_dim]).
             Note: it needs to be the 1st arg as well to work correctly
-        config (`PretrainedConfig`, *optional*):
+        config (`PreTrainedConfig`, *optional*):
             The config of the model for which this Cache will be used. If passed, it will be used to check for sliding
             or hybrid layer structure, greatly reducing the memory requirement of the cached tensors to
             `[batch_size, num_heads, min(seq_len, sliding_window), head_dim]`.
@@ -953,7 +953,7 @@ class DynamicCache(Cache):
     def __init__(
         self,
         ddp_cache_data: Optional[Iterable[tuple[torch.Tensor, torch.Tensor]]] = None,
-        config: Optional[PretrainedConfig] = None,
+        config: Optional[PreTrainedConfig] = None,
         offloading: bool = False,
         offload_only_non_sliding: bool = False,
     ):
@@ -1036,7 +1036,7 @@ class StaticCache(Cache):
     See `Cache` for details on common methods that are implemented by all cache classes.
     Args:
-        config (`PretrainedConfig`):
+        config (`PreTrainedConfig`):
             The config of the model for which this Cache will be used. It will be used to check for sliding
             or hybrid layer structure, and initialize each layer accordingly.
         max_cache_len (`int`):
@@ -1070,7 +1070,7 @@ class StaticCache(Cache):
     # Pass-in kwargs as well to avoid crashing for BC (it used more arguments before)
     def __init__(
         self,
-        config: PretrainedConfig,
+        config: PreTrainedConfig,
         max_cache_len: int,
         offloading: bool = False,
         offload_only_non_sliding: bool = True,
@@ -1124,7 +1124,7 @@ class QuantizedCache(Cache):
     Args:
         backend (`str`):
             The quantization backend to use. One of `("quanto", "hqq").
-        config (`PretrainedConfig`):
+        config (`PreTrainedConfig`):
             The config of the model for which this Cache will be used.
         nbits (`int`, *optional*, defaults to 4):
             The number of bits for quantization.
@@ -1141,7 +1141,7 @@ class QuantizedCache(Cache):
     def __init__(
         self,
         backend: str,
-        config: PretrainedConfig,
+        config: PreTrainedConfig,
         nbits: int = 4,
         axis_key: int = 0,
         axis_value: int = 0,
@@ -1400,7 +1400,7 @@ class OffloadedCache(DynamicCache):
 class OffloadedStaticCache(StaticCache):
-    def __init__(self, config: PretrainedConfig, max_cache_len: int, *args, **kwargs):
+    def __init__(self, config: PreTrainedConfig, max_cache_len: int, *args, **kwargs):
         logger.warning_once(
             "`OffloadedStaticCache` is deprecated and will be removed in version v4.59 "
             "Use `StaticCache(..., offloading=True)` instead"
@@ -1409,7 +1409,7 @@ class OffloadedStaticCache(StaticCache):
 class SlidingWindowCache(StaticCache):
-    def __init__(self, config: PretrainedConfig, max_cache_len: int, *args, **kwargs):
+    def __init__(self, config: PreTrainedConfig, max_cache_len: int, *args, **kwargs):
         logger.warning_once(
             "`SlidingWindowCache` is deprecated and will be removed in version v4.59 "
             "Use `StaticCache(...)` instead which will correctly infer the type of each layer."
@@ -1418,7 +1418,7 @@ class SlidingWindowCache(StaticCache):
 class HybridCache(StaticCache):
-    def __init__(self, config: PretrainedConfig, max_cache_len: int, *args, **kwargs):
+    def __init__(self, config: PreTrainedConfig, max_cache_len: int, *args, **kwargs):
         logger.warning_once(
             "`HybridCache` is deprecated and will be removed in version v4.59 "
             "Use `StaticCache(...)` instead which will correctly infer the type of each layer."
@@ -1427,7 +1427,7 @@ class HybridCache(StaticCache):
 class HybridChunkedCache(StaticCache):
-    def __init__(self, config: PretrainedConfig, max_cache_len: int, *args, **kwargs):
+    def __init__(self, config: PreTrainedConfig, max_cache_len: int, *args, **kwargs):
         logger.warning_once(
             "`HybridChunkedCache` is deprecated and will be removed in version v4.59 "
             "Use `StaticCache(...)` instead which will correctly infer the type of each layer."
@@ -1436,7 +1436,7 @@ class HybridChunkedCache(StaticCache):
 class OffloadedHybridCache(StaticCache):
-    def __init__(self, config: PretrainedConfig, max_cache_len: int, *args, **kwargs):
+    def __init__(self, config: PreTrainedConfig, max_cache_len: int, *args, **kwargs):
         logger.warning_once(
             "`OffloadedHybridCache` is deprecated and will be removed in version v4.59 "
             "Use `StaticCache(..., offload=True)` instead which will correctly infer the type of each layer."
@@ -1447,7 +1447,7 @@ class OffloadedHybridCache(StaticCache):
 class QuantoQuantizedCache(QuantizedCache):
     def __init__(
         self,
-        config: PretrainedConfig,
+        config: PreTrainedConfig,
         nbits: int = 4,
         axis_key: int = 0,
         axis_value: int = 0,
@@ -1464,7 +1464,7 @@ class QuantoQuantizedCache(QuantizedCache):
 class HQQQuantizedCache(QuantizedCache):
     def __init__(
         self,
-        config: PretrainedConfig,
+        config: PreTrainedConfig,
         nbits: int = 4,
         axis_key: int = 0,
         axis_value: int = 0,
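The deprecated cache classes above all follow one pattern: the old class name survives as a thin subclass that warns once and forwards to the replacement behavior. A minimal self-contained sketch of that pattern, using a simplified `StaticCache` stand-in rather than the real implementation:

```python
import warnings


class StaticCache:
    """Simplified stand-in for the replacement class."""

    def __init__(self, max_cache_len: int, offloading: bool = False):
        self.max_cache_len = max_cache_len
        self.offloading = offloading


class OffloadedStaticCache(StaticCache):
    """Deprecated alias kept for backward compatibility."""

    def __init__(self, max_cache_len: int, *args, **kwargs):
        # Warn, then delegate with the flag the old class implied.
        warnings.warn(
            "`OffloadedStaticCache` is deprecated; use `StaticCache(..., offloading=True)`",
            FutureWarning,
        )
        super().__init__(max_cache_len, *args, offloading=True, **kwargs)


cache = OffloadedStaticCache(max_cache_len=128)
assert cache.offloading is True
```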


@@ -46,11 +46,11 @@ if TYPE_CHECKING:
 logger = logging.get_logger(__name__)
-# type hinting: specifying the type of config class that inherits from PretrainedConfig
-SpecificPretrainedConfigType = TypeVar("SpecificPretrainedConfigType", bound="PretrainedConfig")
+# type hinting: specifying the type of config class that inherits from PreTrainedConfig
+SpecificPreTrainedConfigType = TypeVar("SpecificPreTrainedConfigType", bound="PreTrainedConfig")
-class PretrainedConfig(PushToHubMixin):
+class PreTrainedConfig(PushToHubMixin):
     # no-format
     r"""
     Base class for all configuration classes. Handles a few parameters common to all models' configurations as well as
@@ -70,7 +70,7 @@ class PretrainedConfig(PushToHubMixin):
     - **has_no_defaults_at_init** (`bool`) -- Whether the config class can be initialized without providing input arguments.
       Some configurations requires inputs to be defined at init and have no default values, usually these are composite configs,
       (but not necessarily) such as [`~transformers.EncoderDecoderConfig`] or [`~RagConfig`]. They have to be initialized from
-      two or more configs of type [`~transformers.PretrainedConfig`].
+      two or more configs of type [`~transformers.PreTrainedConfig`].
     - **keys_to_ignore_at_inference** (`list[str]`) -- A list of keys to ignore by default when looking at dictionary
       outputs of the model during inference.
     - **attribute_map** (`dict[str, str]`) -- A dict that maps model specific attribute names to the standardized
@@ -186,7 +186,7 @@ class PretrainedConfig(PushToHubMixin):
     model_type: str = ""
     base_config_key: str = ""
-    sub_configs: dict[str, type["PretrainedConfig"]] = {}
+    sub_configs: dict[str, type["PreTrainedConfig"]] = {}
     has_no_defaults_at_init: bool = False
     attribute_map: dict[str, str] = {}
     base_model_tp_plan: Optional[dict[str, Any]] = None
@@ -432,7 +432,7 @@ class PretrainedConfig(PushToHubMixin):
     def save_pretrained(self, save_directory: Union[str, os.PathLike], push_to_hub: bool = False, **kwargs):
         """
         Save a configuration object to the directory `save_directory`, so that it can be re-loaded using the
-        [`~PretrainedConfig.from_pretrained`] class method.
+        [`~PreTrainedConfig.from_pretrained`] class method.
         Args:
             save_directory (`str` or `os.PathLike`):
@@ -522,7 +522,7 @@ class PretrainedConfig(PushToHubMixin):
     @classmethod
     def from_pretrained(
-        cls: type[SpecificPretrainedConfigType],
+        cls: type[SpecificPreTrainedConfigType],
         pretrained_model_name_or_path: Union[str, os.PathLike],
         cache_dir: Optional[Union[str, os.PathLike]] = None,
         force_download: bool = False,
@@ -530,9 +530,9 @@ class PretrainedConfig(PushToHubMixin):
         token: Optional[Union[str, bool]] = None,
         revision: str = "main",
         **kwargs,
-    ) -> SpecificPretrainedConfigType:
+    ) -> SpecificPreTrainedConfigType:
         r"""
-        Instantiate a [`PretrainedConfig`] (or a derived class) from a pretrained model configuration.
+        Instantiate a [`PreTrainedConfig`] (or a derived class) from a pretrained model configuration.
         Args:
             pretrained_model_name_or_path (`str` or `os.PathLike`):
@@ -541,7 +541,7 @@ class PretrainedConfig(PushToHubMixin):
                 - a string, the *model id* of a pretrained model configuration hosted inside a model repo on
                   huggingface.co.
                 - a path to a *directory* containing a configuration file saved using the
-                  [`~PretrainedConfig.save_pretrained`] method, e.g., `./my_model_directory/`.
+                  [`~PreTrainedConfig.save_pretrained`] method, e.g., `./my_model_directory/`.
                 - a path or url to a saved configuration JSON *file*, e.g., `./my_model_directory/configuration.json`.
             cache_dir (`str` or `os.PathLike`, *optional*):
                 Path to a directory in which a downloaded pretrained model configuration should be cached if the
@@ -581,12 +581,12 @@ class PretrainedConfig(PushToHubMixin):
                 by the `return_unused_kwargs` keyword parameter.
         Returns:
-            [`PretrainedConfig`]: The configuration object instantiated from this pretrained model.
+            [`PreTrainedConfig`]: The configuration object instantiated from this pretrained model.
         Examples:
         ```python
-        # We can't instantiate directly the base class *PretrainedConfig* so let's show the examples on a
+        # We can't instantiate directly the base class *PreTrainedConfig* so let's show the examples on a
         # derived class: BertConfig
         config = BertConfig.from_pretrained(
             "google-bert/bert-base-uncased"
@@ -636,7 +636,7 @@ class PretrainedConfig(PushToHubMixin):
     ) -> tuple[dict[str, Any], dict[str, Any]]:
         """
         From a `pretrained_model_name_or_path`, resolve to a dictionary of parameters, to be used for instantiating a
-        [`PretrainedConfig`] using `from_dict`.
+        [`PreTrainedConfig`] using `from_dict`.
         Parameters:
             pretrained_model_name_or_path (`str` or `os.PathLike`):
@@ -761,20 +761,20 @@ class PretrainedConfig(PushToHubMixin):
     @classmethod
     def from_dict(
-        cls: type[SpecificPretrainedConfigType], config_dict: dict[str, Any], **kwargs
-    ) -> SpecificPretrainedConfigType:
+        cls: type[SpecificPreTrainedConfigType], config_dict: dict[str, Any], **kwargs
+    ) -> SpecificPreTrainedConfigType:
         """
-        Instantiates a [`PretrainedConfig`] from a Python dictionary of parameters.
+        Instantiates a [`PreTrainedConfig`] from a Python dictionary of parameters.
         Args:
             config_dict (`dict[str, Any]`):
                 Dictionary that will be used to instantiate the configuration object. Such a dictionary can be
-                retrieved from a pretrained checkpoint by leveraging the [`~PretrainedConfig.get_config_dict`] method.
+                retrieved from a pretrained checkpoint by leveraging the [`~PreTrainedConfig.get_config_dict`] method.
             kwargs (`dict[str, Any]`):
                 Additional parameters from which to initialize the configuration object.
         Returns:
-            [`PretrainedConfig`]: The configuration object instantiated from those parameters.
+            [`PreTrainedConfig`]: The configuration object instantiated from those parameters.
         """
         return_unused_kwargs = kwargs.pop("return_unused_kwargs", False)
         # Those arguments may be passed along for our internal telemetry.
@@ -815,7 +815,7 @@ class PretrainedConfig(PushToHubMixin):
                 current_attr = getattr(config, key)
                 # To authorize passing a custom subconfig as kwarg in models that have nested configs.
                 # We need to update only custom kwarg values instead and keep other attributes in subconfig.
-                if isinstance(current_attr, PretrainedConfig) and isinstance(value, dict):
+                if isinstance(current_attr, PreTrainedConfig) and isinstance(value, dict):
                     current_attr_updated = current_attr.to_dict()
                     current_attr_updated.update(value)
                     value = current_attr.__class__(**current_attr_updated)
@@ -833,17 +833,17 @@ class PretrainedConfig(PushToHubMixin):
     @classmethod
     def from_json_file(
-        cls: type[SpecificPretrainedConfigType], json_file: Union[str, os.PathLike]
-    ) -> SpecificPretrainedConfigType:
+        cls: type[SpecificPreTrainedConfigType], json_file: Union[str, os.PathLike]
+    ) -> SpecificPreTrainedConfigType:
         """
-        Instantiates a [`PretrainedConfig`] from the path to a JSON file of parameters.
+        Instantiates a [`PreTrainedConfig`] from the path to a JSON file of parameters.
         Args:
             json_file (`str` or `os.PathLike`):
                 Path to the JSON file containing the parameters.
         Returns:
-            [`PretrainedConfig`]: The configuration object instantiated from that JSON file.
+            [`PreTrainedConfig`]: The configuration object instantiated from that JSON file.
         """
         config_dict = cls._dict_from_json_file(json_file)
@@ -856,7 +856,7 @@ class PretrainedConfig(PushToHubMixin):
         return json.loads(text)
     def __eq__(self, other):
-        return isinstance(other, PretrainedConfig) and (self.__dict__ == other.__dict__)
+        return isinstance(other, PreTrainedConfig) and (self.__dict__ == other.__dict__)
     def __repr__(self):
         return f"{self.__class__.__name__} {self.to_json_string()}"
@@ -876,7 +876,7 @@ class PretrainedConfig(PushToHubMixin):
         config_dict = self.to_dict()
         # Get the default config dict (from a fresh PreTrainedConfig instance)
-        default_config_dict = PretrainedConfig().to_dict()
+        default_config_dict = PreTrainedConfig().to_dict()
         # get class specific config dict
         class_config_dict = self.__class__().to_dict() if not self.has_no_defaults_at_init else {}
@@ -887,7 +887,7 @@ class PretrainedConfig(PushToHubMixin):
         # except always keep the 'config' attribute.
         for key, value in config_dict.items():
             if (
-                isinstance(getattr(self, key, None), PretrainedConfig)
+                isinstance(getattr(self, key, None), PreTrainedConfig)
                 and key in class_config_dict
                 and isinstance(class_config_dict[key], dict)
                 or key in self.sub_configs
@@ -940,7 +940,7 @@ class PretrainedConfig(PushToHubMixin):
         for key, value in output.items():
             # Deal with nested configs like CLIP
-            if isinstance(value, PretrainedConfig):
+            if isinstance(value, PreTrainedConfig):
                 value = value.to_dict()
                 del value["transformers_version"]
@@ -964,7 +964,7 @@ class PretrainedConfig(PushToHubMixin):
         Args:
             use_diff (`bool`, *optional*, defaults to `True`):
-                If set to `True`, only the difference between the config instance and the default `PretrainedConfig()`
+                If set to `True`, only the difference between the config instance and the default `PreTrainedConfig()`
                 is serialized to JSON string.
         Returns:
@@ -984,7 +984,7 @@ class PretrainedConfig(PushToHubMixin):
             json_file_path (`str` or `os.PathLike`):
                 Path to the JSON file in which this configuration instance's parameters will be saved.
             use_diff (`bool`, *optional*, defaults to `True`):
-                If set to `True`, only the difference between the config instance and the default `PretrainedConfig()`
+                If set to `True`, only the difference between the config instance and the default `PreTrainedConfig()`
                 is serialized to JSON file.
         """
         with open(json_file_path, "w", encoding="utf-8") as writer:
@@ -1137,7 +1137,7 @@ class PretrainedConfig(PushToHubMixin):
     def _get_non_default_generation_parameters(self) -> dict[str, Any]:
         """
-        Gets the non-default generation parameters on the PretrainedConfig instance
+        Gets the non-default generation parameters on the PreTrainedConfig instance
         """
         non_default_generation_parameters = {}
         decoder_attribute_name = None
@@ -1179,7 +1179,7 @@ class PretrainedConfig(PushToHubMixin):
         return non_default_generation_parameters
-    def get_text_config(self, decoder=None, encoder=None) -> "PretrainedConfig":
+    def get_text_config(self, decoder=None, encoder=None) -> "PreTrainedConfig":
         """
         Returns the text config related to the text input (encoder) or text output (decoder) of the model. The
         `decoder` and `encoder` input arguments can be used to specify which end of the model we are interested in,
@@ -1335,7 +1335,7 @@ def recursive_diff_dict(dict_a, dict_b, config_obj=None):
     default = config_obj.__class__().to_dict() if config_obj is not None else {}
     for key, value in dict_a.items():
         obj_value = getattr(config_obj, str(key), None)
-        if isinstance(obj_value, PretrainedConfig) and key in dict_b and isinstance(dict_b[key], dict):
+        if isinstance(obj_value, PreTrainedConfig) and key in dict_b and isinstance(dict_b[key], dict):
            diff_value = recursive_diff_dict(value, dict_b[key], config_obj=obj_value)
            diff[key] = diff_value
        elif key not in dict_b or (value != default[key]):
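`recursive_diff_dict` keeps only the values that differ from a default config, recursing into nested sub-configs. A simplified stand-alone sketch of that idea, using plain dicts instead of config objects (not transformers' exact code):

```python
def recursive_diff(d: dict, default: dict) -> dict:
    """Return only the entries of `d` that differ from `default`."""
    diff = {}
    for key, value in d.items():
        if isinstance(value, dict) and isinstance(default.get(key), dict):
            # Recurse into nested dicts (the analogue of nested sub-configs).
            nested = recursive_diff(value, default[key])
            if nested:
                diff[key] = nested
        elif key not in default or value != default[key]:
            diff[key] = value
    return diff


default = {"hidden_size": 768, "text_config": {"vocab_size": 1000}}
config = {"hidden_size": 1024, "text_config": {"vocab_size": 1000}}
assert recursive_diff(config, default) == {"hidden_size": 1024}
```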
@@ -1343,13 +1343,17 @@ def recursive_diff_dict(dict_a, dict_b, config_obj=None):
     return diff
-PretrainedConfig.push_to_hub = copy_func(PretrainedConfig.push_to_hub)
-if PretrainedConfig.push_to_hub.__doc__ is not None:
-    PretrainedConfig.push_to_hub.__doc__ = PretrainedConfig.push_to_hub.__doc__.format(
+PreTrainedConfig.push_to_hub = copy_func(PreTrainedConfig.push_to_hub)
+if PreTrainedConfig.push_to_hub.__doc__ is not None:
+    PreTrainedConfig.push_to_hub.__doc__ = PreTrainedConfig.push_to_hub.__doc__.format(
         object="config", object_class="AutoConfig", object_files="configuration file"
     )
+# The alias is only here for BC - we did not have the correct CamelCasing before
+PretrainedConfig = PreTrainedConfig
 ALLOWED_LAYER_TYPES = (
     "full_attention",
     "sliding_attention",


@@ -613,7 +613,7 @@ def custom_object_save(obj: Any, folder: Union[str, os.PathLike], config: Option
     Args:
         obj (`Any`): The object for which to save the module files.
         folder (`str` or `os.PathLike`): The folder where to save.
-        config (`PretrainedConfig` or dictionary, `optional`):
+        config (`PreTrainedConfig` or dictionary, `optional`):
             A config in which to register the auto_map corresponding to this custom object.
     Returns:


@@ -23,7 +23,7 @@ from dataclasses import dataclass, is_dataclass
 from typing import TYPE_CHECKING, Any, Callable, Optional, Union
 from .. import __version__
-from ..configuration_utils import PretrainedConfig
+from ..configuration_utils import PreTrainedConfig
 from ..utils import (
     GENERATION_CONFIG_NAME,
     ExplicitEnum,
@@ -1101,13 +1101,13 @@ class GenerationConfig(PushToHubMixin):
             writer.write(self.to_json_string(use_diff=use_diff))
     @classmethod
-    def from_model_config(cls, model_config: PretrainedConfig) -> "GenerationConfig":
+    def from_model_config(cls, model_config: PreTrainedConfig) -> "GenerationConfig":
         """
-        Instantiates a [`GenerationConfig`] from a [`PretrainedConfig`]. This function is useful to convert legacy
-        [`PretrainedConfig`] objects, which may contain generation parameters, into a stand-alone [`GenerationConfig`].
+        Instantiates a [`GenerationConfig`] from a [`PreTrainedConfig`]. This function is useful to convert legacy
+        [`PreTrainedConfig`] objects, which may contain generation parameters, into a stand-alone [`GenerationConfig`].
         Args:
-            model_config (`PretrainedConfig`):
+            model_config (`PreTrainedConfig`):
                 The model config that will be used to instantiate the generation config.
         Returns:
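The docstring above describes converting a legacy model config, which carried generation parameters inline, into a stand-alone generation config. A hedged sketch of that extraction using plain dicts and an illustrative (not exhaustive) field list:

```python
# Illustrative subset of generation-related fields; the real helper knows
# the full set of GenerationConfig attributes.
GENERATION_FIELDS = ("max_length", "do_sample", "temperature", "num_beams")


def from_model_config(model_config: dict) -> dict:
    """Copy only the generation-related keys the model config defines."""
    return {k: model_config[k] for k in GENERATION_FIELDS if k in model_config}


# A legacy config mixing architecture and generation parameters:
legacy = {"vocab_size": 50257, "num_beams": 4, "temperature": 0.7}
gen = from_model_config(legacy)
assert gen == {"num_beams": 4, "temperature": 0.7}
```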


@@ -18,14 +18,14 @@ from typing import Optional, Union
 import torch
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...generation.configuration_utils import GenerationConfig
 from ...utils.metrics import attach_tracer, traced
 from .cache_manager import CacheAllocator, FullAttentionCacheAllocator, SlidingAttentionCacheAllocator
 from .requests import get_device_and_memory_breakdown, logger
-def group_layers_by_attn_type(config: PretrainedConfig) -> tuple[list[list[int]], list[str]]:
+def group_layers_by_attn_type(config: PreTrainedConfig) -> tuple[list[list[int]], list[str]]:
     """
     Group layers depending on the attention mix, according to VLLM's hybrid allocator rules:
     - Layers in each group need to have the same type of attention
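The docstring's first rule (layers in a group must share one attention type) can be sketched as grouping contiguous runs of equal layer types. This is illustrative only: the real function reads per-layer types from the config and applies the allocator's further rules.

```python
from itertools import groupby


def group_layers_by_attn_type(layer_types: list[str]) -> tuple[list[list[int]], list[str]]:
    """Group contiguous runs of layers that share the same attention type."""
    groups, types = [], []
    for attn_type, run in groupby(enumerate(layer_types), key=lambda pair: pair[1]):
        groups.append([index for index, _ in run])
        types.append(attn_type)
    return groups, types


groups, types = group_layers_by_attn_type(
    ["full_attention", "full_attention", "sliding_attention", "full_attention"]
)
assert groups == [[0, 1], [2], [3]]
assert types == ["full_attention", "sliding_attention", "full_attention"]
```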
@@ -119,7 +119,7 @@ class PagedAttentionCache:
     # TODO: this init is quite long, maybe a refactor is in order
     def __init__(
         self,
-        config: PretrainedConfig,
+        config: PreTrainedConfig,
         generation_config: GenerationConfig,
         device: torch.device,
         dtype: torch.dtype = torch.float16,


@@ -25,7 +25,7 @@ import torch
 from torch import nn
 from tqdm import tqdm
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...generation.configuration_utils import GenerationConfig
 from ...utils.logging import logging
 from ...utils.metrics import ContinuousBatchProcessorMetrics, attach_tracer, traced
@@ -140,7 +140,7 @@ class ContinuousBatchProcessor:
     def __init__(
         self,
         cache: PagedAttentionCache,
-        config: PretrainedConfig,
+        config: PreTrainedConfig,
         generation_config: GenerationConfig,
         input_queue: queue.Queue,
         output_queue: queue.Queue,


@@ -25,7 +25,7 @@ from torch.nn import BCELoss
 from ..modeling_utils import PreTrainedModel
 from ..utils import ModelOutput, logging
-from .configuration_utils import PretrainedConfig, WatermarkingConfig
+from .configuration_utils import PreTrainedConfig, WatermarkingConfig
 from .logits_process import SynthIDTextWatermarkLogitsProcessor, WatermarkLogitsProcessor
@@ -75,7 +75,7 @@ class WatermarkDetector:
     See [the paper](https://huggingface.co/papers/2306.04634) for more information.

     Args:
-        model_config (`PretrainedConfig`):
+        model_config (`PreTrainedConfig`):
             The model config that will be used to get model specific arguments used when generating.
         device (`str`):
             The device which was used during watermarked text generation.
@@ -119,7 +119,7 @@ class WatermarkDetector:
     def __init__(
         self,
-        model_config: PretrainedConfig,
+        model_config: PreTrainedConfig,
         device: str,
         watermarking_config: Union[WatermarkingConfig, dict],
         ignore_repeated_ngrams: bool = False,
@@ -237,13 +237,13 @@ class WatermarkDetector:
         return prediction


-class BayesianDetectorConfig(PretrainedConfig):
+class BayesianDetectorConfig(PreTrainedConfig):
     """
     This is the configuration class to store the configuration of a [`BayesianDetectorModel`]. It is used to
     instantiate a Bayesian Detector model according to the specified arguments.

-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.

     Args:
         watermarking_depth (`int`, *optional*):
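`BayesianDetectorConfig` follows the standard config-subclass pattern: declare model-specific defaults as `__init__` keyword arguments, store them as attributes, and forward everything else to the base class. A minimal dependency-free sketch of that pattern (the `BaseConfig` stand-in replaces `PreTrainedConfig` so the snippet runs without transformers; `base_rate` is assumed from the real class and may differ):

```python
class BaseConfig:
    # Stand-in for `PreTrainedConfig`: just absorbs common kwargs.
    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)


class BayesianDetectorConfigSketch(BaseConfig):
    # Declare model-specific defaults, store them, forward the rest upward.
    def __init__(self, watermarking_depth=None, base_rate=0.5, **kwargs):
        self.watermarking_depth = watermarking_depth
        self.base_rate = base_rate
        super().__init__(**kwargs)
```

Any extra keyword argument (e.g. a generic config field) ends up as an attribute via the base class, which is how `from_pretrained` can round-trip arbitrary JSON keys.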


@@ -19,7 +19,7 @@ import torch
 import torch.nn.functional as F

 from .cache_utils import Cache
-from .configuration_utils import PretrainedConfig
+from .configuration_utils import PreTrainedConfig
 from .utils import is_torch_xpu_available, logging
 from .utils.generic import GeneralInterface
 from .utils.import_utils import is_torch_flex_attn_available, is_torch_greater_or_equal, is_torchdynamo_compiling
@@ -662,7 +662,7 @@ def find_packed_sequence_indices(position_ids: torch.Tensor) -> torch.Tensor:

 def _preprocess_mask_arguments(
-    config: PretrainedConfig,
+    config: PreTrainedConfig,
     input_embeds: torch.Tensor,
     attention_mask: Optional[Union[torch.Tensor, BlockMask]],
    cache_position: torch.Tensor,
@@ -675,7 +675,7 @@ def _preprocess_mask_arguments(
     key-value length and offsets, and if we should early exit or not.

     Args:
-        config (`PretrainedConfig`):
+        config (`PreTrainedConfig`):
             The model config.
         input_embeds (`torch.Tensor`):
             The input embeddings of shape (batch_size, query_length, hidden_dim). This is used only to infer the
@@ -743,7 +743,7 @@ def _preprocess_mask_arguments(

 def create_causal_mask(
-    config: PretrainedConfig,
+    config: PreTrainedConfig,
     input_embeds: torch.Tensor,
     attention_mask: Optional[torch.Tensor],
     cache_position: torch.Tensor,
@@ -758,7 +758,7 @@ def create_causal_mask(
     to what is needed in the `modeling_xxx.py` files).

     Args:
-        config (`PretrainedConfig`):
+        config (`PreTrainedConfig`):
             The model config.
         input_embeds (`torch.Tensor`):
             The input embeddings of shape (batch_size, query_length, hidden_dim). This is used only to infer the
@@ -837,7 +837,7 @@ def create_causal_mask(

 def create_sliding_window_causal_mask(
-    config: PretrainedConfig,
+    config: PreTrainedConfig,
     input_embeds: torch.Tensor,
     attention_mask: Optional[torch.Tensor],
     cache_position: torch.Tensor,
@@ -853,7 +853,7 @@ def create_sliding_window_causal_mask(
     `modeling_xxx.py` files).

     Args:
-        config (`PretrainedConfig`):
+        config (`PreTrainedConfig`):
             The model config.
         input_embeds (`torch.Tensor`):
             The input embeddings of shape (batch_size, query_length, hidden_dim). This is used only to infer the
@@ -934,7 +934,7 @@ def create_sliding_window_causal_mask(

 def create_chunked_causal_mask(
-    config: PretrainedConfig,
+    config: PreTrainedConfig,
     input_embeds: torch.Tensor,
     attention_mask: Optional[torch.Tensor],
     cache_position: torch.Tensor,
@@ -950,7 +950,7 @@ def create_chunked_causal_mask(
     `modeling_xxx.py` files).

     Args:
-        config (`PretrainedConfig`):
+        config (`PreTrainedConfig`):
             The model config.
         input_embeds (`torch.Tensor`):
             The input embeddings of shape (batch_size, query_length, hidden_dim). This is used only to infer the
@@ -1063,7 +1063,7 @@ LAYER_PATTERN_TO_MASK_FUNCTION_MAPPING = {

 def create_masks_for_generate(
-    config: PretrainedConfig,
+    config: PreTrainedConfig,
     input_embeds: torch.Tensor,
     attention_mask: Optional[torch.Tensor],
     cache_position: torch.Tensor,
@@ -1078,7 +1078,7 @@ def create_masks_for_generate(
     in order to easily create the masks in advance, when we compile the forwards with Static caches.

     Args:
-        config (`PretrainedConfig`):
+        config (`PreTrainedConfig`):
             The model config.
         input_embeds (`torch.Tensor`):
             The input embeddings of shape (batch_size, query_length, hidden_dim). This is used only to infer the
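The functions above build causal, sliding-window, and chunked attention masks from a config plus runtime tensors. As a hypothetical, dependency-free sketch of the mask *shapes* only (the real functions return torch tensors or FlexAttention `BlockMask`s; plain nested lists of booleans are used here, with `True` meaning "may attend"):

```python
def causal_mask(q_len, kv_len):
    # Query position i may attend to key positions j <= i.
    return [[j <= i for j in range(kv_len)] for i in range(q_len)]


def sliding_window_causal_mask(q_len, kv_len, window):
    # Causal, but restricted to the last `window` key positions.
    return [[j <= i and i - j < window for j in range(kv_len)] for i in range(q_len)]


def chunked_causal_mask(q_len, kv_len, chunk_size):
    # Causal, but a query only attends to keys inside its own chunk.
    return [
        [j <= i and j // chunk_size == i // chunk_size for j in range(kv_len)]
        for i in range(q_len)
    ]
```

The sliding-window and chunked variants simply intersect the causal rule with an extra locality constraint, which is why all three share the same preprocessing helper.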


@@ -16,7 +16,7 @@ import math
 from functools import wraps
 from typing import Optional

-from .configuration_utils import PretrainedConfig
+from .configuration_utils import PreTrainedConfig
 from .utils import is_torch_available, logging
@@ -90,14 +90,14 @@ def dynamic_rope_update(rope_forward):

 def _compute_default_rope_parameters(
-    config: Optional[PretrainedConfig] = None,
+    config: Optional[PreTrainedConfig] = None,
     device: Optional["torch.device"] = None,
     seq_len: Optional[int] = None,
 ) -> tuple["torch.Tensor", float]:
     """
     Computes the inverse frequencies according to the original RoPE implementation

     Args:
-        config ([`~transformers.PretrainedConfig`]):
+        config ([`~transformers.PreTrainedConfig`]):
             The model configuration. This function assumes that the config will provide at least the following
             properties:
@@ -133,14 +133,14 @@ def _compute_default_rope_parameters(

 def _compute_linear_scaling_rope_parameters(
-    config: Optional[PretrainedConfig] = None,
+    config: Optional[PreTrainedConfig] = None,
     device: Optional["torch.device"] = None,
     seq_len: Optional[int] = None,
 ) -> tuple["torch.Tensor", float]:
     """
     Computes the inverse frequencies with linear scaling. Credits to the Reddit user /u/kaiokendev

     Args:
-        config ([`~transformers.PretrainedConfig`]):
+        config ([`~transformers.PreTrainedConfig`]):
             The model configuration. This function assumes that the config will provide at least the following
             properties:
@@ -176,7 +176,7 @@ def _compute_linear_scaling_rope_parameters(

 def _compute_dynamic_ntk_parameters(
-    config: Optional[PretrainedConfig] = None,
+    config: Optional[PreTrainedConfig] = None,
     device: Optional["torch.device"] = None,
     seq_len: Optional[int] = None,
 ) -> tuple["torch.Tensor", float]:
@@ -184,7 +184,7 @@ def _compute_dynamic_ntk_parameters(
     Computes the inverse frequencies with NTK scaling. Credits to the Reddit users /u/bloc97 and /u/emozilla

     Args:
-        config ([`~transformers.PretrainedConfig`]):
+        config ([`~transformers.PreTrainedConfig`]):
             The model configuration. This function assumes that the config will provide at least the following
             properties:
@@ -244,14 +244,14 @@ def _compute_dynamic_ntk_parameters(

 def _compute_yarn_parameters(
-    config: PretrainedConfig, device: "torch.device", seq_len: Optional[int] = None
+    config: PreTrainedConfig, device: "torch.device", seq_len: Optional[int] = None
 ) -> tuple["torch.Tensor", float]:
     """
     Computes the inverse frequencies with NTK scaling. Please refer to the
     [original paper](https://huggingface.co/papers/2309.00071)

     Args:
-        config ([`~transformers.PretrainedConfig`]):
+        config ([`~transformers.PreTrainedConfig`]):
             The model configuration. This function assumes that the config will provide at least the following
             properties:
@@ -369,14 +369,14 @@ def _compute_yarn_parameters(

 def _compute_longrope_parameters(
-    config: PretrainedConfig, device: "torch.device", seq_len: Optional[int] = None
+    config: PreTrainedConfig, device: "torch.device", seq_len: Optional[int] = None
 ) -> tuple["torch.Tensor", float]:
     """
     Computes the inverse frequencies with LongRoPE scaling. Please refer to the
     [original implementation](https://github.com/microsoft/LongRoPE)

     Args:
-        config ([`~transformers.PretrainedConfig`]):
+        config ([`~transformers.PreTrainedConfig`]):
             The model configuration. This function assumes that the config will provide at least the following
             properties:
@@ -451,13 +451,13 @@ def _compute_longrope_parameters(

 def _compute_llama3_parameters(
-    config: PretrainedConfig, device: "torch.device", seq_len: Optional[int] = None
+    config: PreTrainedConfig, device: "torch.device", seq_len: Optional[int] = None
 ) -> tuple["torch.Tensor", float]:
     """
     Computes the inverse frequencies for llama 3.1.

     Args:
-        config ([`~transformers.PretrainedConfig`]):
+        config ([`~transformers.PreTrainedConfig`]):
             The model configuration. This function assumes that the config will provide at least the following
             properties:
@@ -557,7 +557,7 @@ def _check_received_keys(
         logger.warning(f"Unrecognized keys in `rope_scaling` for 'rope_type'='{rope_type}': {unused_keys}")


-def _validate_default_rope_parameters(config: PretrainedConfig, ignore_keys: Optional[set] = None):
+def _validate_default_rope_parameters(config: PreTrainedConfig, ignore_keys: Optional[set] = None):
     rope_scaling = config.rope_scaling
     rope_type = rope_scaling.get("rope_type", rope_scaling.get("type", None))  # BC: "rope_type" was originally "type"
     required_keys = {"rope_type"}
@@ -565,7 +565,7 @@ def _validate_default_rope_parameters(config: PretrainedConfig, ignore_keys: Opt
     _check_received_keys(rope_type, received_keys, required_keys, ignore_keys=ignore_keys)


-def _validate_linear_scaling_rope_parameters(config: PretrainedConfig, ignore_keys: Optional[set] = None):
+def _validate_linear_scaling_rope_parameters(config: PreTrainedConfig, ignore_keys: Optional[set] = None):
     rope_scaling = config.rope_scaling
     rope_type = rope_scaling.get("rope_type", rope_scaling.get("type", None))  # BC: "rope_type" was originally "type"
     required_keys = {"rope_type", "factor"}
@@ -577,7 +577,7 @@ def _validate_linear_scaling_rope_parameters(config: PretrainedConfig, ignore_ke
         logger.warning(f"`rope_scaling`'s factor field must be a float >= 1, got {factor}")


-def _validate_dynamic_scaling_rope_parameters(config: PretrainedConfig, ignore_keys: Optional[set] = None):
+def _validate_dynamic_scaling_rope_parameters(config: PreTrainedConfig, ignore_keys: Optional[set] = None):
     rope_scaling = config.rope_scaling
     rope_type = rope_scaling.get("rope_type", rope_scaling.get("type", None))  # BC: "rope_type" was originally "type"
     required_keys = {"rope_type", "factor"}
@@ -591,7 +591,7 @@ def _validate_dynamic_scaling_rope_parameters(config: PretrainedConfig, ignore_k
         logger.warning(f"`rope_scaling`'s factor field must be a float >= 1, got {factor}")


-def _validate_yarn_parameters(config: PretrainedConfig, ignore_keys: Optional[set] = None):
+def _validate_yarn_parameters(config: PreTrainedConfig, ignore_keys: Optional[set] = None):
     rope_scaling = config.rope_scaling
     rope_type = rope_scaling.get("rope_type", rope_scaling.get("type", None))  # BC: "rope_type" was originally "type"
     required_keys = {"rope_type", "factor"}
@@ -657,7 +657,7 @@ def _validate_yarn_parameters(config: PretrainedConfig, ignore_keys: Optional[se
         )


-def _validate_longrope_parameters(config: PretrainedConfig, ignore_keys: Optional[set] = None):
+def _validate_longrope_parameters(config: PreTrainedConfig, ignore_keys: Optional[set] = None):
     rope_scaling = config.rope_scaling
     rope_type = rope_scaling.get("rope_type", rope_scaling.get("type", None))  # BC: "rope_type" was originally "type"
     required_keys = {"rope_type", "short_factor", "long_factor"}
@@ -707,7 +707,7 @@ def _validate_longrope_parameters(config: PretrainedConfig, ignore_keys: Optiona
         )


-def _validate_llama3_parameters(config: PretrainedConfig, ignore_keys: Optional[set] = None):
+def _validate_llama3_parameters(config: PreTrainedConfig, ignore_keys: Optional[set] = None):
     rope_scaling = config.rope_scaling
     rope_type = rope_scaling.get("rope_type", rope_scaling.get("type", None))  # BC: "rope_type" was originally "type"
     required_keys = {"rope_type", "factor", "original_max_position_embeddings", "low_freq_factor", "high_freq_factor"}
@@ -754,11 +754,11 @@ ROPE_VALIDATION_FUNCTIONS = {
 }


-def rope_config_validation(config: PretrainedConfig, ignore_keys: Optional[set] = None):
+def rope_config_validation(config: PreTrainedConfig, ignore_keys: Optional[set] = None):
     """
-    Validate the RoPE config arguments, given a `PretrainedConfig` object
+    Validate the RoPE config arguments, given a `PreTrainedConfig` object
     """
-    rope_scaling = getattr(config, "rope_scaling", None)  # not a default parameter in `PretrainedConfig`
+    rope_scaling = getattr(config, "rope_scaling", None)  # not a default parameter in `PreTrainedConfig`
     if rope_scaling is None:
         return
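The `_compute_*_rope_parameters` family all return a tuple `(inv_freq, attention_scaling)`. A minimal pure-Python sketch of the default and linear-scaling variants (for illustration only; the real functions read `rope_theta`, the head dimension, and `rope_scaling` from the config and return torch tensors):

```python
def compute_default_rope_parameters(rope_theta, head_dim):
    # inv_freq[k] = 1 / theta^(2k / head_dim): one frequency per pair of dims.
    inv_freq = [1.0 / (rope_theta ** (i / head_dim)) for i in range(0, head_dim, 2)]
    attention_scaling = 1.0  # the default rope type applies no post-scaling
    return inv_freq, attention_scaling


def compute_linear_scaling_rope_parameters(rope_theta, head_dim, factor):
    # Linear scaling simply divides every inverse frequency by `factor`,
    # stretching the effective context window by that factor.
    inv_freq, attention_scaling = compute_default_rope_parameters(rope_theta, head_dim)
    return [f / factor for f in inv_freq], attention_scaling
```

The YaRN, LongRoPE, and llama3 variants follow the same contract but reshape the frequency spectrum non-uniformly instead of dividing it by a constant.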


@@ -44,7 +44,7 @@ from torch import Tensor, nn
 from torch.distributions import constraints
 from torch.utils.checkpoint import checkpoint

-from .configuration_utils import PretrainedConfig
+from .configuration_utils import PreTrainedConfig
 from .distributed import DistributedConfig
 from .dynamic_module_utils import custom_object_save
 from .generation import CompileConfig, GenerationConfig
@@ -1149,11 +1149,11 @@ def _get_dtype(
     cls,
     dtype: Optional[Union[str, torch.dtype, dict]],
     checkpoint_files: Optional[list[str]],
-    config: PretrainedConfig,
+    config: PreTrainedConfig,
     sharded_metadata: Optional[dict],
     state_dict: Optional[dict],
     weights_only: bool,
-) -> tuple[PretrainedConfig, Optional[torch.dtype], Optional[torch.dtype]]:
+) -> tuple[PreTrainedConfig, Optional[torch.dtype], Optional[torch.dtype]]:
     """Find the correct `dtype` to use based on provided arguments. Also update the `config` based on the
     inferred dtype. We do the following:
     1. If dtype is not None, we use that dtype
@@ -1780,7 +1780,7 @@ class PreTrainedModel(nn.Module, EmbeddingAccessMixin, ModuleUtilsMixin, PushToH
     Class attributes (overridden by derived classes):

-        - **config_class** ([`PretrainedConfig`]) -- A subclass of [`PretrainedConfig`] to use as configuration class
+        - **config_class** ([`PreTrainedConfig`]) -- A subclass of [`PreTrainedConfig`] to use as configuration class
           for this model architecture.
         - **base_model_prefix** (`str`) -- A string indicating the attribute associated to the base model in derived
           classes of the same architecture adding modules on top of the base model.
@@ -1935,12 +1935,12 @@ class PreTrainedModel(nn.Module, EmbeddingAccessMixin, ModuleUtilsMixin, PushToH
         elif full_annotation is not None:
             cls.config_class = full_annotation

-    def __init__(self, config: PretrainedConfig, *inputs, **kwargs):
+    def __init__(self, config: PreTrainedConfig, *inputs, **kwargs):
         super().__init__()
-        if not isinstance(config, PretrainedConfig):
+        if not isinstance(config, PreTrainedConfig):
             raise TypeError(
                 f"Parameter config in `{self.__class__.__name__}(config)` should be an instance of class "
-                "`PretrainedConfig`. To create a model from a pretrained model use "
+                "`PreTrainedConfig`. To create a model from a pretrained model use "
                 f"`model = {self.__class__.__name__}.from_pretrained(PRETRAINED_MODEL_NAME)`"
             )
         self.config = config
@@ -4250,7 +4250,7 @@ class PreTrainedModel(nn.Module, EmbeddingAccessMixin, ModuleUtilsMixin, PushToH
         cls: type[SpecificPreTrainedModelType],
         pretrained_model_name_or_path: Optional[Union[str, os.PathLike]],
         *model_args,
-        config: Optional[Union[PretrainedConfig, str, os.PathLike]] = None,
+        config: Optional[Union[PreTrainedConfig, str, os.PathLike]] = None,
         cache_dir: Optional[Union[str, os.PathLike]] = None,
         ignore_mismatched_sizes: bool = False,
         force_download: bool = False,
@@ -4285,11 +4285,11 @@ class PreTrainedModel(nn.Module, EmbeddingAccessMixin, ModuleUtilsMixin, PushToH
                 arguments `config` and `state_dict`).
             model_args (sequence of positional arguments, *optional*):
                 All remaining positional arguments will be passed to the underlying model's `__init__` method.
-            config (`Union[PretrainedConfig, str, os.PathLike]`, *optional*):
+            config (`Union[PreTrainedConfig, str, os.PathLike]`, *optional*):
                 Can be either:

-                    - an instance of a class derived from [`PretrainedConfig`],
-                    - a string or path valid as input to [`~PretrainedConfig.from_pretrained`].
+                    - an instance of a class derived from [`PreTrainedConfig`],
+                    - a string or path valid as input to [`~PreTrainedConfig.from_pretrained`].

                 Configuration for the model to use instead of an automatically loaded configuration. Configuration can
                 be automatically loaded when:
@@ -4437,7 +4437,7 @@ class PreTrainedModel(nn.Module, EmbeddingAccessMixin, ModuleUtilsMixin, PushToH
                   underlying model's `__init__` method (we assume all relevant updates to the configuration have
                   already been done)
                 - If a configuration is not provided, `kwargs` will be first passed to the configuration class
-                  initialization function ([`~PretrainedConfig.from_pretrained`]). Each key of `kwargs` that
+                  initialization function ([`~PreTrainedConfig.from_pretrained`]). Each key of `kwargs` that
                   corresponds to a configuration attribute will be used to override said attribute with the
                   supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute
                   will be passed to the underlying model's `__init__` function.
@@ -4574,7 +4574,7 @@ class PreTrainedModel(nn.Module, EmbeddingAccessMixin, ModuleUtilsMixin, PushToH
             raise ValueError("accelerate is required when loading a GGUF file `pip install accelerate`.")

         if commit_hash is None:
-            if not isinstance(config, PretrainedConfig):
+            if not isinstance(config, PreTrainedConfig):
                 # We make a call to the config file first (which may be absent) to get the commit hash as soon as possible
                 resolved_config_file = cached_file(
                     pretrained_model_name_or_path,
@@ -4681,7 +4681,7 @@ class PreTrainedModel(nn.Module, EmbeddingAccessMixin, ModuleUtilsMixin, PushToH
                 local_files_only = True

         # Load config if we don't provide a configuration
-        if not isinstance(config, PretrainedConfig):
+        if not isinstance(config, PreTrainedConfig):
             config_path = config if config is not None else pretrained_model_name_or_path
             config, model_kwargs = cls.config_class.from_pretrained(
                 config_path,
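Because `PreTrainedModel.__init__` now checks `isinstance(config, PreTrainedConfig)`, the commit message mentions adding "a metaclass for full BC" so code holding the old `PretrainedConfig` name keeps working. The sketch below is a hypothetical illustration of that general pattern, not the code this commit actually adds: an alias class whose metaclass routes `isinstance` checks to the new class while warning.

```python
import warnings


class PreTrainedConfig:
    """The new, correctly capitalized name."""


class _DeprecatedConfigMeta(type):
    # Route isinstance checks on the old name to the new class, with a warning.
    def __instancecheck__(cls, instance):
        warnings.warn(
            "`PretrainedConfig` is deprecated, use `PreTrainedConfig` instead.",
            FutureWarning,
        )
        return isinstance(instance, PreTrainedConfig)


class PretrainedConfig(PreTrainedConfig, metaclass=_DeprecatedConfigMeta):
    """Old name, kept as a deprecated alias of `PreTrainedConfig`."""
```

With this shape, existing subclasses of `PretrainedConfig` remain subclasses of `PreTrainedConfig`, so the `isinstance` check in `__init__` above still passes for them.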


@@ -21,22 +21,22 @@
 from typing import Optional

-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...utils import logging


 logger = logging.get_logger(__name__)


-class Aimv2VisionConfig(PretrainedConfig):
+class Aimv2VisionConfig(PreTrainedConfig):
     r"""
     This is the configuration class to store the configuration of a [`Aimv2VisionModel`]. It is used to instantiate a
     AIMv2 vision encoder according to the specified arguments, defining the model architecture. Instantiating a
     configuration with the defaults will yield a similar configuration to that of the vision encoder of the AIMv2
     [apple/aimv2-large-patch14-224](https://huggingface.co/apple/aimv2-large-patch14-224) architecture.

-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.

     Args:
         hidden_size (`int`, *optional*, defaults to 1024):
@@ -127,15 +127,15 @@ class Aimv2VisionConfig(PretrainedConfig):
         self.is_native = is_native


-class Aimv2TextConfig(PretrainedConfig):
+class Aimv2TextConfig(PreTrainedConfig):
     r"""
     This is the configuration class to store the configuration of a [`Aimv2TextModel`]. It is used to instantiate a
     AIMv2 text encoder according to the specified arguments, defining the model architecture. Instantiating a
     configuration with the defaults will yield a similar configuration to that of the text encoder of the AIMv2
     [apple/aimv2-large-patch14-224-lit](https://huggingface.co/apple/aimv2-large-patch14-224-lit) architecture.

-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.

     Args:
         vocab_size (`int`, *optional*, defaults to 49408):
@@ -212,15 +212,15 @@ class Aimv2TextConfig(PretrainedConfig):
         self.rms_norm_eps = rms_norm_eps


-class Aimv2Config(PretrainedConfig):
+class Aimv2Config(PreTrainedConfig):
     r"""
     [`Aimv2Config`] is the configuration class to store the configuration of a [`Aimv2Model`]. It is used to
     instantiate a AIMv2 model according to the specified arguments, defining the text model and vision model configs.
     Instantiating a configuration with the defaults will yield a similar configuration to that of the AIMv2
     [apple/aimv2-large-patch14-224-lit](https://huggingface.co/apple/aimv2-large-patch14-224-lit) architecture.

-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.

     Args:
         text_config (`dict`, *optional*):


@@ -47,8 +47,8 @@ class Aimv2VisionConfig(SiglipVisionConfig):
 configuration with the defaults will yield a similar configuration to that of the vision encoder of the AIMv2
 [apple/aimv2-large-patch14-224](https://huggingface.co/apple/aimv2-large-patch14-224) architecture.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:
 hidden_size (`int`, *optional*, defaults to 1024):
@@ -147,8 +147,8 @@ class Aimv2TextConfig(SiglipTextConfig):
 configuration with the defaults will yield a similar configuration to that of the text encoder of the AIMv2
 [apple/aimv2-large-patch14-224-lit](https://huggingface.co/apple/aimv2-large-patch14-224-lit) architecture.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:
 vocab_size (`int`, *optional*, defaults to 49408):
@@ -238,8 +238,8 @@ class Aimv2Config(SiglipConfig):
 Instantiating a configuration with the defaults will yield a similar configuration to that of the AIMv2
 [apple/aimv2-large-patch14-224-lit](https://huggingface.co/apple/aimv2-large-patch14-224-lit) architecture.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:
 text_config (`dict`, *optional*):


@@ -18,19 +18,19 @@
 from collections import OrderedDict
 from collections.abc import Mapping
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...onnx import OnnxConfig
-class AlbertConfig(PretrainedConfig):
+class AlbertConfig(PreTrainedConfig):
 r"""
 This is the configuration class to store the configuration of a [`AlbertModel`] or a [`TFAlbertModel`]. It is used
 to instantiate an ALBERT model according to the specified arguments, defining the model architecture. Instantiating
 a configuration with the defaults will yield a similar configuration to that of the ALBERT
 [albert/albert-xxlarge-v2](https://huggingface.co/albert/albert-xxlarge-v2) architecture.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:
 vocab_size (`int`, *optional*, defaults to 30000):
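Across all of these hunks only the base-class name changes; the `from_pretrained` / `save_pretrained` contract documented on the configuration base class is untouched. As a reminder of what that contract does, here is a minimal self-contained sketch — the `PreTrainedConfig` below is a stand-in stub written for illustration, not the real transformers class:

```python
import json
import os
import tempfile


class PreTrainedConfig:
    # Stand-in stub illustrating the save/load contract of the real base class.
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

    def save_pretrained(self, save_directory):
        # The real class also serializes configs to a config.json file.
        os.makedirs(save_directory, exist_ok=True)
        with open(os.path.join(save_directory, "config.json"), "w") as f:
            json.dump(self.__dict__, f)

    @classmethod
    def from_pretrained(cls, save_directory, **overrides):
        with open(os.path.join(save_directory, "config.json")) as f:
            attrs = json.load(f)
        attrs.update(overrides)  # kwargs override the stored values
        return cls(**attrs)


with tempfile.TemporaryDirectory() as d:
    cfg = PreTrainedConfig(vocab_size=30000, hidden_size=4096)
    cfg.save_pretrained(d)
    # Loading with an override keeps stored attributes but replaces hidden_size.
    reloaded = PreTrainedConfig.from_pretrained(d, hidden_size=1024)

assert reloaded.vocab_size == 30000
assert reloaded.hidden_size == 1024
```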


@@ -14,14 +14,14 @@
 # limitations under the License.
 """ALIGN model configuration"""
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...utils import logging
 logger = logging.get_logger(__name__)
-class AlignTextConfig(PretrainedConfig):
+class AlignTextConfig(PreTrainedConfig):
 r"""
 This is the configuration class to store the configuration of a [`AlignTextModel`]. It is used to instantiate a
 ALIGN text encoder according to the specified arguments, defining the model architecture. Instantiating a
@@ -29,8 +29,8 @@ class AlignTextConfig(PretrainedConfig):
 [kakaobrain/align-base](https://huggingface.co/kakaobrain/align-base) architecture. The default values here are
 copied from BERT.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:
 vocab_size (`int`, *optional*, defaults to 30522):
@@ -128,7 +128,7 @@ class AlignTextConfig(PretrainedConfig):
 self.pad_token_id = pad_token_id
-class AlignVisionConfig(PretrainedConfig):
+class AlignVisionConfig(PreTrainedConfig):
 r"""
 This is the configuration class to store the configuration of a [`AlignVisionModel`]. It is used to instantiate a
 ALIGN vision encoder according to the specified arguments, defining the model architecture. Instantiating a ALIGN
@@ -136,8 +136,8 @@ class AlignVisionConfig(PretrainedConfig):
 [kakaobrain/align-base](https://huggingface.co/kakaobrain/align-base) architecture. The default values are copied
 from EfficientNet (efficientnet-b7)
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:
 num_channels (`int`, *optional*, defaults to 3):
@@ -250,15 +250,15 @@ class AlignVisionConfig(PretrainedConfig):
 self.num_hidden_layers = sum(num_block_repeats) * 4
-class AlignConfig(PretrainedConfig):
+class AlignConfig(PreTrainedConfig):
 r"""
 [`AlignConfig`] is the configuration class to store the configuration of a [`AlignModel`]. It is used to
 instantiate a ALIGN model according to the specified arguments, defining the text model and vision model configs.
 Instantiating a configuration with the defaults will yield a similar configuration to that of the ALIGN
 [kakaobrain/align-base](https://huggingface.co/kakaobrain/align-base) architecture.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:
 text_config (`dict`, *optional*):


@@ -14,22 +14,22 @@
 # limitations under the License.
 """AltCLIP model configuration"""
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...utils import logging
 logger = logging.get_logger(__name__)
-class AltCLIPTextConfig(PretrainedConfig):
+class AltCLIPTextConfig(PreTrainedConfig):
 r"""
 This is the configuration class to store the configuration of a [`AltCLIPTextModel`]. It is used to instantiate a
 AltCLIP text model according to the specified arguments, defining the model architecture. Instantiating a
 configuration with the defaults will yield a similar configuration to that of the AltCLIP
 [BAAI/AltCLIP](https://huggingface.co/BAAI/AltCLIP) architecture.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:
@@ -139,15 +139,15 @@ class AltCLIPTextConfig(PretrainedConfig):
 self.project_dim = project_dim
-class AltCLIPVisionConfig(PretrainedConfig):
+class AltCLIPVisionConfig(PreTrainedConfig):
 r"""
 This is the configuration class to store the configuration of a [`AltCLIPModel`]. It is used to instantiate an
 AltCLIP model according to the specified arguments, defining the model architecture. Instantiating a configuration
 with the defaults will yield a similar configuration to that of the AltCLIP
 [BAAI/AltCLIP](https://huggingface.co/BAAI/AltCLIP) architecture.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:
@@ -232,15 +232,15 @@ class AltCLIPVisionConfig(PretrainedConfig):
 self.hidden_act = hidden_act
-class AltCLIPConfig(PretrainedConfig):
+class AltCLIPConfig(PreTrainedConfig):
 r"""
 This is the configuration class to store the configuration of a [`AltCLIPModel`]. It is used to instantiate an
 AltCLIP model according to the specified arguments, defining the model architecture. Instantiating a configuration
 with the defaults will yield a similar configuration to that of the AltCLIP
 [BAAI/AltCLIP](https://huggingface.co/BAAI/AltCLIP) architecture.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:
 text_config (`dict`, *optional*):


@@ -20,19 +20,19 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...modeling_rope_utils import rope_config_validation
-class ApertusConfig(PretrainedConfig):
+class ApertusConfig(PreTrainedConfig):
 r"""
 This is the configuration class to store the configuration of a [`ApertusModel`]. It is used to instantiate a Apertus
 model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
 defaults will yield a similar configuration to that of the Apertus-8B.
 e.g. [swiss-ai/Apertus-8B](https://huggingface.co/swiss-ai/Apertus-8B)
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:


@@ -48,8 +48,8 @@ class ApertusConfig(LlamaConfig):
 defaults will yield a similar configuration to that of the Apertus-8B.
 e.g. [swiss-ai/Apertus-8B](https://huggingface.co/swiss-ai/Apertus-8B)
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:


@@ -19,11 +19,11 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...modeling_rope_utils import rope_config_validation
-class ArceeConfig(PretrainedConfig):
+class ArceeConfig(PreTrainedConfig):
 r"""
 This is the configuration class to store the configuration of a [`ArceeModel`]. It is used to instantiate an Arcee
 model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
@@ -33,8 +33,8 @@ class ArceeConfig(PretrainedConfig):
 [arcee-ai/AFM-4.5B](https://huggingface.co/arcee-ai/AFM-4.5B)
 and were used to build the examples below.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:
 vocab_size (`int`, *optional*, defaults to 32000):


@@ -39,8 +39,8 @@ class ArceeConfig(LlamaConfig):
 [arcee-ai/AFM-4.5B](https://huggingface.co/arcee-ai/AFM-4.5B)
 and were used to build the examples below.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:
 vocab_size (`int`, *optional*, defaults to 32000):


@@ -20,12 +20,12 @@
 # limitations under the License.
 from typing import Optional
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...modeling_rope_utils import rope_config_validation
 from ..auto import CONFIG_MAPPING, AutoConfig
-class AriaTextConfig(PretrainedConfig):
+class AriaTextConfig(PreTrainedConfig):
 r"""
 This class handles the configuration for the text component of the Aria model.
 Instantiating a configuration with the defaults will yield a similar configuration to that of the model of the Aria
@@ -220,15 +220,15 @@ class AriaTextConfig(PretrainedConfig):
 self.moe_num_shared_experts = moe_num_shared_experts
-class AriaConfig(PretrainedConfig):
+class AriaConfig(PreTrainedConfig):
 r"""
 This class handles the configuration for both vision and text components of the Aria model,
 as well as additional parameters for image token handling and projector mapping.
 Instantiating a configuration with the defaults will yield a similar configuration to that of the model of the Aria
 [rhymes-ai/Aria](https://huggingface.co/rhymes-ai/Aria) architecture.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:
 vision_config (`AriaVisionConfig` or `dict`, *optional*):


@@ -21,7 +21,7 @@ from torch import nn
 from ...activations import ACT2FN
 from ...cache_utils import Cache
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...image_processing_utils import BaseImageProcessor, BatchFeature, get_patch_output_size, select_best_resolution
 from ...image_transforms import PaddingMode, convert_to_rgb, pad, resize, to_channel_dimension_format
 from ...image_utils import (
@@ -221,15 +221,15 @@ class AriaTextConfig(LlamaConfig):
 self.moe_num_shared_experts = moe_num_shared_experts
-class AriaConfig(PretrainedConfig):
+class AriaConfig(PreTrainedConfig):
 r"""
 This class handles the configuration for both vision and text components of the Aria model,
 as well as additional parameters for image token handling and projector mapping.
 Instantiating a configuration with the defaults will yield a similar configuration to that of the model of the Aria
 [rhymes-ai/Aria](https://huggingface.co/rhymes-ai/Aria) architecture.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:
 vision_config (`AriaVisionConfig` or `dict`, *optional*):


@@ -16,14 +16,14 @@
 from typing import Any
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...utils import logging
 logger = logging.get_logger(__name__)
-class ASTConfig(PretrainedConfig):
+class ASTConfig(PreTrainedConfig):
 r"""
 This is the configuration class to store the configuration of a [`ASTModel`]. It is used to instantiate an AST
 model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
@@ -31,8 +31,8 @@ class ASTConfig(PretrainedConfig):
 [MIT/ast-finetuned-audioset-10-10-0.4593](https://huggingface.co/MIT/ast-finetuned-audioset-10-10-0.4593)
 architecture.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:
 hidden_size (`int`, *optional*, defaults to 768):


@@ -23,7 +23,7 @@ from collections import OrderedDict
 from collections.abc import Iterator
 from typing import Any, TypeVar, Union
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...dynamic_module_utils import get_class_from_dynamic_module, resolve_trust_remote_code
 from ...utils import (
 CONFIG_NAME,
@@ -65,7 +65,7 @@ FROM_CONFIG_DOCSTRING = """
 model's configuration. Use [`~BaseAutoModelClass.from_pretrained`] to load the model weights.
 Args:
-config ([`PretrainedConfig`]):
+config ([`PreTrainedConfig`]):
 The model class to instantiate is selected based on the configuration class:
 List options
@@ -104,7 +104,7 @@ FROM_PRETRAINED_TORCH_DOCSTRING = """
 [`~PreTrainedModel.save_pretrained`], e.g., `./my_model_directory/`.
 model_args (additional positional arguments, *optional*):
 Will be passed along to the underlying model `__init__()` method.
-config ([`PretrainedConfig`], *optional*):
+config ([`PreTrainedConfig`], *optional*):
 Configuration for the model to use instead of an automatically loaded configuration. Configuration can
 be automatically loaded when:
@@ -155,7 +155,7 @@ FROM_PRETRAINED_TORCH_DOCSTRING = """
 underlying model's `__init__` method (we assume all relevant updates to the configuration have
 already been done)
 - If a configuration is not provided, `kwargs` will be first passed to the configuration class
-initialization function ([`~PretrainedConfig.from_pretrained`]). Each key of `kwargs` that
+initialization function ([`~PreTrainedConfig.from_pretrained`]). Each key of `kwargs` that
 corresponds to a configuration attribute will be used to override said attribute with the
 supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute
 will be passed to the underlying model's `__init__` function.
@@ -243,7 +243,7 @@ class _BaseAutoModelClass:
 )
 @classmethod
-def _prepare_config_for_auto_class(cls, config: PretrainedConfig) -> PretrainedConfig:
+def _prepare_config_for_auto_class(cls, config: PreTrainedConfig) -> PreTrainedConfig:
 """Additional autoclass-specific config post-loading manipulation. May be overridden in subclasses."""
 return config
@@ -284,7 +284,7 @@ class _BaseAutoModelClass:
 hub_kwargs["token"] = token
 if commit_hash is None:
-if not isinstance(config, PretrainedConfig):
+if not isinstance(config, PreTrainedConfig):
 # We make a call to the config file first (which may be absent) to get the commit hash as soon as possible
 resolved_config_file = cached_file(
 pretrained_model_name_or_path,
@@ -315,7 +315,7 @@ class _BaseAutoModelClass:
 adapter_kwargs["_adapter_model_path"] = pretrained_model_name_or_path
 pretrained_model_name_or_path = adapter_config["base_model_name_or_path"]
-if not isinstance(config, PretrainedConfig):
+if not isinstance(config, PreTrainedConfig):
 kwargs_orig = copy.deepcopy(kwargs)
 # ensure not to pollute the config object with dtype="auto" - since it's
 # meaningless in the context of the config object - torch.dtype values are acceptable
@@ -396,7 +396,7 @@ class _BaseAutoModelClass:
 Register a new model for this class.
 Args:
-config_class ([`PretrainedConfig`]):
+config_class ([`PreTrainedConfig`]):
 The configuration corresponding to the model to register.
 model_class ([`PreTrainedModel`]):
 The model to register.
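The commit message mentions adding "a metaclass for full BC", so `isinstance(config, PretrainedConfig)` checks like the ones in this file keep working against the old name. One way such an alias can be wired up — a hedged sketch only; the names and details below are illustrative, not transformers' actual implementation:

```python
import warnings


class PreTrainedConfig:
    """New canonical name for the configuration base class (stub for illustration)."""

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)


class _DeprecatedConfigMeta(type):
    """Metaclass that keeps the old class name fully working while warning on use."""

    def __call__(cls, *args, **kwargs):
        warnings.warn(
            "PretrainedConfig is deprecated; use PreTrainedConfig instead.",
            FutureWarning,
            stacklevel=2,
        )
        return super().__call__(*args, **kwargs)

    def __instancecheck__(cls, instance):
        # isinstance(x, PretrainedConfig) also accepts new-name instances.
        return isinstance(instance, PreTrainedConfig)


class PretrainedConfig(PreTrainedConfig, metaclass=_DeprecatedConfigMeta):
    """Backward-compatible alias: old name, same behavior."""


old_cfg = PretrainedConfig(hidden_size=8)     # emits a FutureWarning
new_cfg = PreTrainedConfig(hidden_size=16)
assert isinstance(old_cfg, PreTrainedConfig)  # old-name instances pass new-name checks
assert isinstance(new_cfg, PretrainedConfig)  # and vice versa
```

Subclassing works unchanged too: a third-party `class MyConfig(PretrainedConfig)` still yields instances of `PreTrainedConfig`, so the `isinstance` checks in `_BaseAutoModelClass` above accept them either way.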
@@ -553,7 +553,7 @@ def add_generation_mixin_to_remote_model(model_class):
 return model_class
-class _LazyAutoMapping(OrderedDict[type[PretrainedConfig], _LazyAutoMappingValue]):
+class _LazyAutoMapping(OrderedDict[type[PreTrainedConfig], _LazyAutoMappingValue]):
 """
 " A mapping config to object (model or tokenizer for instance) that will load keys and values when it is accessed.
@@ -574,7 +574,7 @@ class _LazyAutoMapping(OrderedDict[type[PretrainedConfig], _LazyAutoMappingValue
 common_keys = set(self._config_mapping.keys()).intersection(self._model_mapping.keys())
 return len(common_keys) + len(self._extra_content)
-def __getitem__(self, key: type[PretrainedConfig]) -> _LazyAutoMappingValue:
+def __getitem__(self, key: type[PreTrainedConfig]) -> _LazyAutoMappingValue:
 if key in self._extra_content:
 return self._extra_content[key]
 model_type = self._reverse_config_mapping[key.__name__]
@@ -596,7 +596,7 @@ class _LazyAutoMapping(OrderedDict[type[PretrainedConfig], _LazyAutoMappingValue
 self._modules[module_name] = importlib.import_module(f".{module_name}", "transformers.models")
 return getattribute_from_module(self._modules[module_name], attr)
-def keys(self) -> list[type[PretrainedConfig]]:
+def keys(self) -> list[type[PreTrainedConfig]]:
 mapping_keys = [
 self._load_attr_from_module(key, name)
 for key, name in self._config_mapping.items()
@@ -604,7 +604,7 @@ class _LazyAutoMapping(OrderedDict[type[PretrainedConfig], _LazyAutoMappingValue
 ]
 return mapping_keys + list(self._extra_content.keys())
-def get(self, key: type[PretrainedConfig], default: _T) -> Union[_LazyAutoMappingValue, _T]:
+def get(self, key: type[PreTrainedConfig], default: _T) -> Union[_LazyAutoMappingValue, _T]:
 try:
 return self.__getitem__(key)
 except KeyError:
@@ -621,7 +621,7 @@ class _LazyAutoMapping(OrderedDict[type[PretrainedConfig], _LazyAutoMappingValue
 ]
 return mapping_values + list(self._extra_content.values())
-def items(self) -> list[tuple[type[PretrainedConfig], _LazyAutoMappingValue]]:
+def items(self) -> list[tuple[type[PreTrainedConfig], _LazyAutoMappingValue]]:
 mapping_items = [
 (
 self._load_attr_from_module(key, self._config_mapping[key]),
@@ -632,7 +632,7 @@ class _LazyAutoMapping(OrderedDict[type[PretrainedConfig], _LazyAutoMappingValue
 ]
 return mapping_items + list(self._extra_content.items())
-def __iter__(self) -> Iterator[type[PretrainedConfig]]:
+def __iter__(self) -> Iterator[type[PreTrainedConfig]]:
 return iter(self.keys())
 def __contains__(self, item: type) -> bool:
@@ -643,7 +643,7 @@ class _LazyAutoMapping(OrderedDict[type[PretrainedConfig], _LazyAutoMappingValue
 model_type = self._reverse_config_mapping[item.__name__]
 return model_type in self._model_mapping
-def register(self, key: type[PretrainedConfig], value: _LazyAutoMappingValue, exist_ok=False) -> None:
+def register(self, key: type[PreTrainedConfig], value: _LazyAutoMappingValue, exist_ok=False) -> None:
 """
 Register a new model in this mapping.
 """

View File

@@ -22,7 +22,7 @@ from collections import OrderedDict
 from collections.abc import Callable, Iterator, KeysView, ValuesView
 from typing import Any, TypeVar, Union

-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...dynamic_module_utils import get_class_from_dynamic_module, resolve_trust_remote_code
 from ...utils import CONFIG_NAME, logging
@@ -1031,7 +1031,7 @@ def config_class_to_model_type(config) -> Union[str, None]:
     return None


-class _LazyConfigMapping(OrderedDict[str, type[PretrainedConfig]]):
+class _LazyConfigMapping(OrderedDict[str, type[PreTrainedConfig]]):
     """
     A dictionary that lazily load its values when they are requested.
     """
@@ -1041,7 +1041,7 @@ class _LazyConfigMapping(OrderedDict[str, type[PretrainedConfig]]):
         self._extra_content = {}
         self._modules = {}

-    def __getitem__(self, key: str) -> type[PretrainedConfig]:
+    def __getitem__(self, key: str) -> type[PreTrainedConfig]:
         if key in self._extra_content:
             return self._extra_content[key]
         if key not in self._mapping:
@@ -1061,10 +1061,10 @@ class _LazyConfigMapping(OrderedDict[str, type[PretrainedConfig]]):
     def keys(self) -> list[str]:
         return list(self._mapping.keys()) + list(self._extra_content.keys())

-    def values(self) -> list[type[PretrainedConfig]]:
+    def values(self) -> list[type[PreTrainedConfig]]:
         return [self[k] for k in self._mapping] + list(self._extra_content.values())

-    def items(self) -> list[tuple[str, type[PretrainedConfig]]]:
+    def items(self) -> list[tuple[str, type[PreTrainedConfig]]]:
         return [(k, self[k]) for k in self._mapping] + list(self._extra_content.items())

     def __iter__(self) -> Iterator[str]:
@@ -1073,7 +1073,7 @@ class _LazyConfigMapping(OrderedDict[str, type[PretrainedConfig]]):
     def __contains__(self, item: object) -> bool:
         return item in self._mapping or item in self._extra_content

-    def register(self, key: str, value: type[PretrainedConfig], exist_ok=False) -> None:
+    def register(self, key: str, value: type[PreTrainedConfig], exist_ok=False) -> None:
         """
         Register a new configuration in this mapping.
         """
@@ -1219,7 +1219,7 @@ class AutoConfig:
         )

     @classmethod
-    def for_model(cls, model_type: str, *args, **kwargs) -> PretrainedConfig:
+    def for_model(cls, model_type: str, *args, **kwargs) -> PreTrainedConfig:
         if model_type in CONFIG_MAPPING:
             config_class = CONFIG_MAPPING[model_type]
             return config_class(*args, **kwargs)
@@ -1245,7 +1245,7 @@ class AutoConfig:
                 - A string, the *model id* of a pretrained model configuration hosted inside a model repo on
                   huggingface.co.
                 - A path to a *directory* containing a configuration file saved using the
-                  [`~PretrainedConfig.save_pretrained`] method, or the [`~PreTrainedModel.save_pretrained`] method,
+                  [`~PreTrainedConfig.save_pretrained`] method, or the [`~PreTrainedModel.save_pretrained`] method,
                   e.g., `./my_model_directory/`.
                 - A path or url to a saved configuration JSON *file*, e.g.,
                   `./my_model_directory/configuration.json`.
@@ -1326,7 +1326,7 @@ class AutoConfig:
         trust_remote_code = kwargs.pop("trust_remote_code", None)
         code_revision = kwargs.pop("code_revision", None)

-        config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
+        config_dict, unused_kwargs = PreTrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
         has_remote_code = "auto_map" in config_dict and "AutoConfig" in config_dict["auto_map"]
         has_local_code = "model_type" in config_dict and config_dict["model_type"] in CONFIG_MAPPING
         if has_remote_code:
@@ -1387,9 +1387,9 @@ class AutoConfig:
         Args:
             model_type (`str`): The model type like "bert" or "gpt".
-            config ([`PretrainedConfig`]): The config to register.
+            config ([`PreTrainedConfig`]): The config to register.
         """
-        if issubclass(config, PretrainedConfig) and config.model_type != model_type:
+        if issubclass(config, PreTrainedConfig) and config.model_type != model_type:
             raise ValueError(
                 "The config you are passing has a `model_type` attribute that is not consistent with the model type "
                 f"you passed (config has {config.model_type} and you passed {model_type}. Fix one of those so they "

View File

@@ -22,7 +22,7 @@ from collections import OrderedDict
 from typing import Optional, Union

 # Build the list of all feature extractors
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...dynamic_module_utils import get_class_from_dynamic_module, resolve_trust_remote_code
 from ...feature_extraction_utils import FeatureExtractionMixin
 from ...utils import CONFIG_NAME, FEATURE_EXTRACTOR_NAME, cached_file, logging
@@ -309,7 +309,7 @@ class AutoFeatureExtractor:
         # If we don't find the feature extractor class in the feature extractor config, let's try the model config.
         if feature_extractor_class is None and feature_extractor_auto_map is None:
-            if not isinstance(config, PretrainedConfig):
+            if not isinstance(config, PreTrainedConfig):
                 config = AutoConfig.from_pretrained(
                     pretrained_model_name_or_path, trust_remote_code=trust_remote_code, **kwargs
                 )
@@ -358,7 +358,7 @@ class AutoFeatureExtractor:
         Register a new feature extractor for this class.

         Args:
-            config_class ([`PretrainedConfig`]):
+            config_class ([`PreTrainedConfig`]):
                 The configuration corresponding to the model to register.
             feature_extractor_class ([`FeatureExtractorMixin`]): The feature extractor to register.
         """

View File

@@ -22,7 +22,7 @@ from collections import OrderedDict
 from typing import TYPE_CHECKING, Optional, Union

 # Build the list of all image processors
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...dynamic_module_utils import get_class_from_dynamic_module, resolve_trust_remote_code
 from ...image_processing_utils import ImageProcessingMixin
 from ...image_processing_utils_fast import BaseImageProcessorFast
@@ -502,7 +502,7 @@ class AutoImageProcessor:
         # If we don't find the image processor class in the image processor config, let's try the model config.
         if image_processor_type is None and image_processor_auto_map is None:
-            if not isinstance(config, PretrainedConfig):
+            if not isinstance(config, PreTrainedConfig):
                 config = AutoConfig.from_pretrained(
                     pretrained_model_name_or_path,
                     trust_remote_code=trust_remote_code,
@@ -629,7 +629,7 @@ class AutoImageProcessor:
         Register a new image processor for this class.

         Args:
-            config_class ([`PretrainedConfig`]):
+            config_class ([`PreTrainedConfig`]):
                 The configuration corresponding to the model to register.
             image_processor_class ([`ImageProcessingMixin`]): The image processor to register.
         """

View File

@@ -21,7 +21,7 @@ import warnings
 from collections import OrderedDict

 # Build the list of all feature extractors
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...dynamic_module_utils import get_class_from_dynamic_module, resolve_trust_remote_code
 from ...feature_extraction_utils import FeatureExtractionMixin
 from ...image_processing_utils import ImageProcessingMixin
@@ -356,7 +356,7 @@ class AutoProcessor:
         if processor_class is None:
             # Otherwise, load config, if it can be loaded.
-            if not isinstance(config, PretrainedConfig):
+            if not isinstance(config, PreTrainedConfig):
                 config = AutoConfig.from_pretrained(
                     pretrained_model_name_or_path, trust_remote_code=trust_remote_code, **kwargs
                 )
@@ -430,7 +430,7 @@ class AutoProcessor:
         Register a new processor for this class.

         Args:
-            config_class ([`PretrainedConfig`]):
+            config_class ([`PreTrainedConfig`]):
                 The configuration corresponding to the model to register.
             processor_class ([`ProcessorMixin`]): The processor to register.
         """

View File

@@ -23,7 +23,7 @@ from typing import Any, Optional, Union
 from transformers.utils.import_utils import is_mistral_common_available

-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...dynamic_module_utils import get_class_from_dynamic_module, resolve_trust_remote_code
 from ...modeling_gguf_pytorch_utils import load_gguf_checkpoint
 from ...tokenization_utils import PreTrainedTokenizer
@@ -962,7 +962,7 @@ class AutoTokenizer:
                 applicable to all derived classes)
             inputs (additional positional arguments, *optional*):
                 Will be passed along to the Tokenizer `__init__()` method.
-            config ([`PretrainedConfig`], *optional*)
+            config ([`PreTrainedConfig`], *optional*)
                 The configuration object used to determine the tokenizer class to instantiate.
             cache_dir (`str` or `os.PathLike`, *optional*):
                 Path to a directory in which a downloaded pretrained model configuration should be cached if the
@@ -1076,7 +1076,7 @@ class AutoTokenizer:
         # If that did not work, let's try to use the config.
         if config_tokenizer_class is None:
-            if not isinstance(config, PretrainedConfig):
+            if not isinstance(config, PreTrainedConfig):
                 if gguf_file:
                     gguf_path = cached_file(pretrained_model_name_or_path, gguf_file, **kwargs)
                     config_dict = load_gguf_checkpoint(gguf_path, return_tensors=False)["config"]
@@ -1170,7 +1170,7 @@ class AutoTokenizer:
         Args:
-            config_class ([`PretrainedConfig`]):
+            config_class ([`PreTrainedConfig`]):
                 The configuration corresponding to the model to register.
             slow_tokenizer_class ([`PretrainedTokenizer`], *optional*):
                 The slow tokenizer to register.
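The AutoTokenizer hunks above encode a fallback order: an explicit class name in the tokenizer's own config wins, and only failing that is the class name read off the model config (loaded through AutoConfig when needed). A hedged sketch of that resolution order, with a hypothetical helper name:

```python
# Illustrative only: resolve a tokenizer class name the way the diff's
# "If that did not work, let's try to use the config" comment describes.
def resolve_tokenizer_class(tokenizer_config: dict, model_config) -> "str | None":
    cls = tokenizer_config.get("tokenizer_class")
    if cls is None:
        # fall back to the model config, mirroring `if config_tokenizer_class is None:`
        cls = getattr(model_config, "tokenizer_class", None)
    return cls
```

The real method has more branches (remote code, GGUF checkpoints), but they all feed this same first-match-wins chain.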

View File

@@ -22,7 +22,7 @@ from collections import OrderedDict
 from typing import TYPE_CHECKING, Optional, Union

 # Build the list of all video processors
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...dynamic_module_utils import get_class_from_dynamic_module, resolve_trust_remote_code
 from ...utils import CONFIG_NAME, VIDEO_PROCESSOR_NAME, cached_file, is_torchvision_available, logging
 from ...utils.import_utils import requires
@@ -321,7 +321,7 @@ class AutoVideoProcessor:
         # If we don't find the video processor class in the video processor config, let's try the model config.
         if video_processor_class is None and video_processor_auto_map is None:
-            if not isinstance(config, PretrainedConfig):
+            if not isinstance(config, PreTrainedConfig):
                 config = AutoConfig.from_pretrained(
                     pretrained_model_name_or_path, trust_remote_code=trust_remote_code, **kwargs
                 )
@@ -374,7 +374,7 @@ class AutoVideoProcessor:
         Register a new video processor for this class.

         Args:
-            config_class ([`PretrainedConfig`]):
+            config_class ([`PreTrainedConfig`]):
                 The configuration corresponding to the model to register.
             video_processor_class ([`BaseVideoProcessor`]):
                 The video processor to register.

View File

@@ -16,14 +16,14 @@
 from typing import Optional

-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...utils import logging


 logger = logging.get_logger(__name__)


-class AutoformerConfig(PretrainedConfig):
+class AutoformerConfig(PreTrainedConfig):
     r"""
     This is the configuration class to store the configuration of an [`AutoformerModel`]. It is used to instantiate an
     Autoformer model according to the specified arguments, defining the model architecture. Instantiating a
@@ -31,8 +31,8 @@ class AutoformerConfig(PretrainedConfig):
     [huggingface/autoformer-tourism-monthly](https://huggingface.co/huggingface/autoformer-tourism-monthly)
     architecture.

-    Configuration objects inherit from [`PretrainedConfig`] can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.

     Args:
         prediction_length (`int`):

View File

@@ -14,7 +14,7 @@
 # limitations under the License.
 """AyaVision model configuration"""

-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...utils import logging
 from ..auto import CONFIG_MAPPING, AutoConfig
@@ -22,15 +22,15 @@ from ..auto import CONFIG_MAPPING, AutoConfig
 logger = logging.get_logger(__name__)


-class AyaVisionConfig(PretrainedConfig):
+class AyaVisionConfig(PreTrainedConfig):
     r"""
     This is the configuration class to store the configuration of a [`AyaVisionForConditionalGeneration`]. It is used to instantiate an
     AyaVision model according to the specified arguments, defining the model architecture. Instantiating a configuration
     with the defaults will yield a similar configuration to that of AyaVision.

     e.g. [CohereForAI/aya-vision-8b](https://huggingface.co/CohereForAI/aya-vision-8b)

-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.

     Args:
         vision_config (`Union[AutoConfig, dict]`, *optional*, defaults to `SiglipVisionConfig`):

View File

@@ -14,14 +14,14 @@
 # limitations under the License.
 """Bamba model configuration"""

-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...utils import logging


 logger = logging.get_logger(__name__)


-class BambaConfig(PretrainedConfig):
+class BambaConfig(PreTrainedConfig):
     r"""
     This is the configuration class to store the configuration of a [`BambaModel`]. It is used to instantiate a
     BambaModel model according to the specified arguments, defining the model architecture. Instantiating a configuration
@@ -30,8 +30,8 @@ class BambaConfig(PretrainedConfig):
     The BambaModel is a hybrid [mamba2](https://github.com/state-spaces/mamba) architecture with SwiGLU.
     The checkpoints are jointly trained by IBM, Princeton, and UIUC.

-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.

     Args:
         vocab_size (`int`, *optional*, defaults to 128000):

View File

@@ -16,7 +16,7 @@
 from typing import Optional

-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...utils import add_start_docstrings, logging
 from ..auto import CONFIG_MAPPING, AutoConfig
@@ -30,8 +30,8 @@ BARK_SUBMODELCONFIG_START_DOCSTRING = """
     defaults will yield a similar configuration to that of the Bark [suno/bark](https://huggingface.co/suno/bark)
     architecture.

-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.

     Args:
         block_size (`int`, *optional*, defaults to 1024):
@@ -62,7 +62,7 @@ BARK_SUBMODELCONFIG_START_DOCSTRING = """
 """


-class BarkSubModelConfig(PretrainedConfig):
+class BarkSubModelConfig(PreTrainedConfig):
     keys_to_ignore_at_inference = ["past_key_values"]

     attribute_map = {
@@ -180,7 +180,7 @@ class BarkFineConfig(BarkSubModelConfig):
         super().__init__(tie_word_embeddings=tie_word_embeddings, **kwargs)


-class BarkConfig(PretrainedConfig):
+class BarkConfig(PreTrainedConfig):
     """
     This is the configuration class to store the configuration of a [`BarkModel`]. It is used to instantiate a Bark
     model according to the specified sub-models configurations, defining the model architecture.
@@ -188,8 +188,8 @@ class BarkConfig(PretrainedConfig):
     Instantiating a configuration with the defaults will yield a similar configuration to that of the Bark
     [suno/bark](https://huggingface.co/suno/bark) architecture.

-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.

     Args:
         semantic_config ([`BarkSemanticConfig`], *optional*):
@@ -282,7 +282,7 @@ class BarkConfig(PretrainedConfig):
         semantic_config: BarkSemanticConfig,
         coarse_acoustics_config: BarkCoarseConfig,
         fine_acoustics_config: BarkFineConfig,
-        codec_config: PretrainedConfig,
+        codec_config: PreTrainedConfig,
         **kwargs,
     ):
         r"""

View File

@@ -315,7 +315,7 @@ class BarkGenerationConfig(GenerationConfig):
     def to_dict(self):
         """
-        Serializes this instance to a Python dictionary. Override the default [`~PretrainedConfig.to_dict`].
+        Serializes this instance to a Python dictionary. Override the default [`~PreTrainedConfig.to_dict`].

         Returns:
             `dict[str, any]`: Dictionary of all the attributes that make up this configuration instance,

View File

@@ -20,7 +20,7 @@ from collections.abc import Mapping
 from typing import Any

 from ... import PreTrainedTokenizer
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...onnx import OnnxConfig, OnnxConfigWithPast, OnnxSeq2SeqConfigWithPast
 from ...onnx.utils import compute_effective_axis_dimension
 from ...utils import is_torch_available, logging
@@ -29,15 +29,15 @@ from ...utils import is_torch_available, logging
 logger = logging.get_logger(__name__)


-class BartConfig(PretrainedConfig):
+class BartConfig(PreTrainedConfig):
     r"""
     This is the configuration class to store the configuration of a [`BartModel`]. It is used to instantiate a BART
     model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
     defaults will yield a similar configuration to that of the BART
     [facebook/bart-large](https://huggingface.co/facebook/bart-large) architecture.

-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.

     Args:

View File

@@ -20,12 +20,12 @@ from collections.abc import Mapping

 from packaging import version

-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...onnx import OnnxConfig
 from ...utils.backbone_utils import BackboneConfigMixin, get_aligned_output_features_output_indices


-class BeitConfig(BackboneConfigMixin, PretrainedConfig):
+class BeitConfig(BackboneConfigMixin, PreTrainedConfig):
     r"""
     This is the configuration class to store the configuration of a [`BeitModel`]. It is used to instantiate an BEiT
     model according to the specified arguments, defining the model architecture. Instantiating a configuration with the

View File

@@ -18,7 +18,7 @@
 from collections import OrderedDict
 from collections.abc import Mapping

-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...onnx import OnnxConfig
 from ...utils import logging
@@ -26,15 +26,15 @@ from ...utils import logging
 logger = logging.get_logger(__name__)


-class BertConfig(PretrainedConfig):
+class BertConfig(PreTrainedConfig):
     r"""
     This is the configuration class to store the configuration of a [`BertModel`] or a [`TFBertModel`]. It is used to
     instantiate a BERT model according to the specified arguments, defining the model architecture. Instantiating a
     configuration with the defaults will yield a similar configuration to that of the BERT
     [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased) architecture.

-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.

     Args:

View File

@@ -14,10 +14,10 @@
 # limitations under the License.
 """BertGeneration model configuration"""
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
-class BertGenerationConfig(PretrainedConfig):
+class BertGenerationConfig(PreTrainedConfig):
 r"""
 This is the configuration class to store the configuration of a [`BertGenerationPreTrainedModel`]. It is used to
 instantiate a BertGeneration model according to the specified arguments, defining the model architecture.
@@ -25,8 +25,8 @@ class BertGenerationConfig(PretrainedConfig):
 [google/bert_for_seq_generation_L-24_bbc_encoder](https://huggingface.co/google/bert_for_seq_generation_L-24_bbc_encoder)
 architecture.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:
 vocab_size (`int`, *optional*, defaults to 50358):

View File

@@ -17,7 +17,7 @@
 from collections import OrderedDict
 from collections.abc import Mapping
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...onnx import OnnxConfig
 from ...utils import logging
@@ -25,15 +25,15 @@ from ...utils import logging
 logger = logging.get_logger(__name__)
-class BigBirdConfig(PretrainedConfig):
+class BigBirdConfig(PreTrainedConfig):
 r"""
 This is the configuration class to store the configuration of a [`BigBirdModel`]. It is used to instantiate an
 BigBird model according to the specified arguments, defining the model architecture. Instantiating a configuration
 with the defaults will yield a similar configuration to that of the BigBird
 [google/bigbird-roberta-base](https://huggingface.co/google/bigbird-roberta-base) architecture.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:

View File

@@ -19,7 +19,7 @@ from collections.abc import Mapping
 from typing import Any
 from ... import PreTrainedTokenizer
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...onnx import OnnxConfig, OnnxConfigWithPast, OnnxSeq2SeqConfigWithPast
 from ...onnx.utils import compute_effective_axis_dimension
 from ...utils import is_torch_available, logging
@@ -28,15 +28,15 @@ from ...utils import is_torch_available, logging
 logger = logging.get_logger(__name__)
-class BigBirdPegasusConfig(PretrainedConfig):
+class BigBirdPegasusConfig(PreTrainedConfig):
 r"""
 This is the configuration class to store the configuration of a [`BigBirdPegasusModel`]. It is used to instantiate
 an BigBirdPegasus model according to the specified arguments, defining the model architecture. Instantiating a
 configuration with the defaults will yield a similar configuration to that of the BigBirdPegasus
 [google/bigbird-pegasus-large-arxiv](https://huggingface.co/google/bigbird-pegasus-large-arxiv) architecture.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:

View File

@@ -14,22 +14,22 @@
 # limitations under the License.
 """BioGPT model configuration"""
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...utils import logging
 logger = logging.get_logger(__name__)
-class BioGptConfig(PretrainedConfig):
+class BioGptConfig(PreTrainedConfig):
 r"""
 This is the configuration class to store the configuration of a [`BioGptModel`]. It is used to instantiate an
 BioGPT model according to the specified arguments, defining the model architecture. Instantiating a configuration
 with the defaults will yield a similar configuration to that of the BioGPT
 [microsoft/biogpt](https://huggingface.co/microsoft/biogpt) architecture.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:

View File

@@ -14,7 +14,7 @@
 # limitations under the License.
 """BiT model configuration"""
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...utils import logging
 from ...utils.backbone_utils import BackboneConfigMixin, get_aligned_output_features_output_indices
@@ -22,15 +22,15 @@ from ...utils.backbone_utils import BackboneConfigMixin, get_aligned_output_feat
 logger = logging.get_logger(__name__)
-class BitConfig(BackboneConfigMixin, PretrainedConfig):
+class BitConfig(BackboneConfigMixin, PreTrainedConfig):
 r"""
 This is the configuration class to store the configuration of a [`BitModel`]. It is used to instantiate an BiT
 model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
 defaults will yield a similar configuration to that of the BiT
 [google/bit-50](https://huggingface.co/google/bit-50) architecture.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:
 num_channels (`int`, *optional*, defaults to 3):

View File

@@ -13,22 +13,22 @@
 # See the License for the specific language governing permissions and
 """BitNet model configuration"""
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...utils import logging
 logger = logging.get_logger(__name__)
-class BitNetConfig(PretrainedConfig):
+class BitNetConfig(PreTrainedConfig):
 r"""
 This is the configuration class to store the configuration of a [`BitNetModel`]. It is used to instantiate an BitNet
 model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
 defaults will yield a similar configuration to that of
 BitNet b1.58 2B4T [microsoft/bitnet-b1.58-2B-4T](https://huggingface.co/microsoft/bitnet-b1.58-2B-4T).
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:

View File

@@ -19,7 +19,7 @@ from collections.abc import Mapping
 from typing import Any
 from ... import PreTrainedTokenizer
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...file_utils import is_torch_available
 from ...onnx import OnnxConfig, OnnxConfigWithPast, OnnxSeq2SeqConfigWithPast
 from ...onnx.utils import compute_effective_axis_dimension
@@ -29,15 +29,15 @@ from ...utils import logging
 logger = logging.get_logger(__name__)
-class BlenderbotConfig(PretrainedConfig):
+class BlenderbotConfig(PreTrainedConfig):
 r"""
 This is the configuration class to store the configuration of a [`BlenderbotModel`]. It is used to instantiate an
 Blenderbot model according to the specified arguments, defining the model architecture. Instantiating a
 configuration with the defaults will yield a similar configuration to that of the Blenderbot
 [facebook/blenderbot-3B](https://huggingface.co/facebook/blenderbot-3B) architecture.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:

View File

@@ -19,7 +19,7 @@ from collections.abc import Mapping
 from typing import Any
 from ... import PreTrainedTokenizer
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...file_utils import is_torch_available
 from ...onnx import OnnxConfig, OnnxConfigWithPast, OnnxSeq2SeqConfigWithPast
 from ...onnx.utils import compute_effective_axis_dimension
@@ -29,15 +29,15 @@ from ...utils import logging
 logger = logging.get_logger(__name__)
-class BlenderbotSmallConfig(PretrainedConfig):
+class BlenderbotSmallConfig(PreTrainedConfig):
 r"""
 This is the configuration class to store the configuration of a [`BlenderbotSmallModel`]. It is used to instantiate
 an BlenderbotSmall model according to the specified arguments, defining the model architecture. Instantiating a
 configuration with the defaults will yield a similar configuration to that of the BlenderbotSmall
 [facebook/blenderbot_small-90M](https://huggingface.co/facebook/blenderbot_small-90M) architecture.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:

View File

@@ -14,22 +14,22 @@
 # limitations under the License.
 """Blip model configuration"""
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...utils import logging
 logger = logging.get_logger(__name__)
-class BlipTextConfig(PretrainedConfig):
+class BlipTextConfig(PreTrainedConfig):
 r"""
 This is the configuration class to store the configuration of a [`BlipTextModel`]. It is used to instantiate a BLIP
 text model according to the specified arguments, defining the model architecture. Instantiating a configuration
 with the defaults will yield a similar configuration to that of the `BlipText` used by the [base
 architectures](https://huggingface.co/Salesforce/blip-vqa-base).
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:
@@ -145,15 +145,15 @@ class BlipTextConfig(PretrainedConfig):
 self.label_smoothing = label_smoothing
-class BlipVisionConfig(PretrainedConfig):
+class BlipVisionConfig(PreTrainedConfig):
 r"""
 This is the configuration class to store the configuration of a [`BlipVisionModel`]. It is used to instantiate a
 BLIP vision model according to the specified arguments, defining the model architecture. Instantiating a
 configuration defaults will yield a similar configuration to that of the Blip-base
 [Salesforce/blip-vqa-base](https://huggingface.co/Salesforce/blip-vqa-base) architecture.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:
@@ -227,15 +227,15 @@ class BlipVisionConfig(PretrainedConfig):
 self.hidden_act = hidden_act
-class BlipConfig(PretrainedConfig):
+class BlipConfig(PreTrainedConfig):
 r"""
 [`BlipConfig`] is the configuration class to store the configuration of a [`BlipModel`]. It is used to instantiate
 a BLIP model according to the specified arguments, defining the text model and vision model configs. Instantiating
 a configuration with the defaults will yield a similar configuration to that of the BLIP-base
 [Salesforce/blip-vqa-base](https://huggingface.co/Salesforce/blip-vqa-base) architecture.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:
 text_config (`dict`, *optional*):

View File

@@ -16,7 +16,7 @@
 from typing import Optional
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...models.auto.modeling_auto import MODEL_FOR_CAUSAL_LM_MAPPING_NAMES
 from ...utils import logging
 from ..auto import CONFIG_MAPPING, AutoConfig
@@ -25,15 +25,15 @@ from ..auto import CONFIG_MAPPING, AutoConfig
 logger = logging.get_logger(__name__)
-class Blip2VisionConfig(PretrainedConfig):
+class Blip2VisionConfig(PreTrainedConfig):
 r"""
 This is the configuration class to store the configuration of a [`Blip2VisionModel`]. It is used to instantiate a
 BLIP-2 vision encoder according to the specified arguments, defining the model architecture. Instantiating a
 configuration defaults will yield a similar configuration to that of the BLIP-2
 [Salesforce/blip2-opt-2.7b](https://huggingface.co/Salesforce/blip2-opt-2.7b) architecture.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:
 hidden_size (`int`, *optional*, defaults to 1408):
@@ -107,14 +107,14 @@ class Blip2VisionConfig(PretrainedConfig):
 self.qkv_bias = qkv_bias
-class Blip2QFormerConfig(PretrainedConfig):
+class Blip2QFormerConfig(PreTrainedConfig):
 r"""
 This is the configuration class to store the configuration of a [`Blip2QFormerModel`]. It is used to instantiate a
 BLIP-2 Querying Transformer (Q-Former) model according to the specified arguments, defining the model architecture.
 Instantiating a configuration with the defaults will yield a similar configuration to that of the BLIP-2
 [Salesforce/blip2-opt-2.7b](https://huggingface.co/Salesforce/blip2-opt-2.7b) architecture. Configuration objects
-inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the documentation from
-[`PretrainedConfig`] for more information.
+inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the documentation from
+[`PreTrainedConfig`] for more information.
 Note that [`Blip2QFormerModel`] is very similar to [`BertLMHeadModel`] with interleaved cross-attention.
@@ -215,15 +215,15 @@ class Blip2QFormerConfig(PretrainedConfig):
 self.use_qformer_text_input = use_qformer_text_input
-class Blip2Config(PretrainedConfig):
+class Blip2Config(PreTrainedConfig):
 r"""
 [`Blip2Config`] is the configuration class to store the configuration of a [`Blip2ForConditionalGeneration`]. It is
 used to instantiate a BLIP-2 model according to the specified arguments, defining the vision model, Q-Former model
 and language model configs. Instantiating a configuration with the defaults will yield a similar configuration to
 that of the BLIP-2 [Salesforce/blip2-opt-2.7b](https://huggingface.co/Salesforce/blip2-opt-2.7b) architecture.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:
 vision_config (`dict`, *optional*):
@@ -231,7 +231,7 @@ class Blip2Config(PretrainedConfig):
 qformer_config (`dict`, *optional*):
 Dictionary of configuration options used to initialize [`Blip2QFormerConfig`].
 text_config (`dict`, *optional*):
-Dictionary of configuration options used to initialize any [`PretrainedConfig`].
+Dictionary of configuration options used to initialize any [`PreTrainedConfig`].
 num_query_tokens (`int`, *optional*, defaults to 32):
 The number of query tokens passed through the Transformer.
 image_text_hidden_size (`int`, *optional*, defaults to 256):
@@ -262,7 +262,7 @@ class Blip2Config(PretrainedConfig):
 >>> # Accessing the model configuration
 >>> configuration = model.config
->>> # We can also initialize a Blip2Config from a Blip2VisionConfig, Blip2QFormerConfig and any PretrainedConfig
+>>> # We can also initialize a Blip2Config from a Blip2VisionConfig, Blip2QFormerConfig and any PreTrainedConfig
 >>> # Initializing BLIP-2 vision, BLIP-2 Q-Former and language model configurations
 >>> vision_config = Blip2VisionConfig()
@@ -321,7 +321,7 @@ class Blip2Config(PretrainedConfig):
 cls,
 vision_config: Blip2VisionConfig,
 qformer_config: Blip2QFormerConfig,
-text_config: Optional[PretrainedConfig] = None,
+text_config: Optional[PreTrainedConfig] = None,
 **kwargs,
 ):
 r"""
@@ -334,7 +334,7 @@ class Blip2Config(PretrainedConfig):
 qformer_config (`dict`):
 Dictionary of configuration options used to initialize [`Blip2QFormerConfig`].
 text_config (`dict`, *optional*):
-Dictionary of configuration options used to initialize any [`PretrainedConfig`].
+Dictionary of configuration options used to initialize any [`PreTrainedConfig`].
 Returns:
 [`Blip2Config`]: An instance of a configuration object

View File

@@ -24,7 +24,7 @@ from packaging import version
 if TYPE_CHECKING:
 from ... import PreTrainedTokenizer
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...onnx import OnnxConfigWithPast, PatchingSpec
 from ...utils import is_torch_available, logging
@@ -32,15 +32,15 @@ from ...utils import is_torch_available, logging
 logger = logging.get_logger(__name__)
-class BloomConfig(PretrainedConfig):
+class BloomConfig(PreTrainedConfig):
 """
 This is the configuration class to store the configuration of a [`BloomModel`]. It is used to instantiate a Bloom
 model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
 defaults will yield a similar configuration to the Bloom architecture
 [bigscience/bloom](https://huggingface.co/bigscience/bloom).
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:
@@ -147,7 +147,7 @@ class BloomOnnxConfig(OnnxConfigWithPast):
 def __init__(
 self,
-config: PretrainedConfig,
+config: PreTrainedConfig,
 task: str = "default",
 patching_specs: Optional[list[PatchingSpec]] = None,
 use_past: bool = False,

View File

@@ -14,14 +14,14 @@
 # limitations under the License.
 """Blt model configuration"""
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...utils import logging
 logger = logging.get_logger(__name__)
-class BltLocalEncoderConfig(PretrainedConfig):
+class BltLocalEncoderConfig(PreTrainedConfig):
 """
 Configuration class for the Blt Local Encoder component.
 """
@@ -71,7 +71,7 @@ class BltLocalEncoderConfig(PretrainedConfig):
 super().__init__(**kwargs, tie_word_embeddings=False)
-class BltLocalDecoderConfig(PretrainedConfig):
+class BltLocalDecoderConfig(PreTrainedConfig):
 """
 Configuration class for the Blt Local Decoder component.
 """
@@ -121,7 +121,7 @@ class BltLocalDecoderConfig(PretrainedConfig):
 super().__init__(**kwargs, tie_word_embeddings=False)
-class BltGlobalTransformerConfig(PretrainedConfig):
+class BltGlobalTransformerConfig(PreTrainedConfig):
 """
 Configuration class for the Blt Global Transformer component.
 """
@@ -163,7 +163,7 @@ class BltGlobalTransformerConfig(PretrainedConfig):
 super().__init__(**kwargs, tie_word_embeddings=False)
-class BltPatcherConfig(PretrainedConfig):
+class BltPatcherConfig(PreTrainedConfig):
 r"""
 Configuration class for the Blt Patcher/Entropy model component.
@@ -239,13 +239,13 @@ class BltPatcherConfig(PretrainedConfig):
 super().__init__(**kwargs, tie_word_embeddings=False)
-class BltConfig(PretrainedConfig):
+class BltConfig(PreTrainedConfig):
 r"""
 This is the configuration class to store the configuration of a [`BltModel`]. It is used to instantiate a
 Blt model according to the specified arguments, defining the model architecture.
-Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-documentation from [`PretrainedConfig`] for more information.
+Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+documentation from [`PreTrainedConfig`] for more information.
 Args:
 vocab_size (`int`, *optional*, defaults to 260):

View File

@@ -14,21 +14,21 @@
 # limitations under the License.
 """BridgeTower model configuration"""
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...utils import logging
 logger = logging.get_logger(__name__)
-class BridgeTowerVisionConfig(PretrainedConfig):
+class BridgeTowerVisionConfig(PreTrainedConfig):
     r"""
     This is the configuration class to store the vision configuration of a [`BridgeTowerModel`]. Instantiating a
     configuration with the defaults will yield a similar configuration to that of the bridgetower-base
     [BridgeTower/bridgetower-base](https://huggingface.co/BridgeTower/bridgetower-base/) architecture.
-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.
     Args:
         hidden_size (`int`, *optional*, defaults to 768):
@@ -94,15 +94,15 @@ class BridgeTowerVisionConfig(PretrainedConfig):
         self.remove_last_layer = remove_last_layer
-class BridgeTowerTextConfig(PretrainedConfig):
+class BridgeTowerTextConfig(PreTrainedConfig):
     r"""
     This is the configuration class to store the text configuration of a [`BridgeTowerModel`]. The default values here
     are copied from RoBERTa. Instantiating a configuration with the defaults will yield a similar configuration to that
     of the bridgetower-base [BridegTower/bridgetower-base](https://huggingface.co/BridgeTower/bridgetower-base/)
     architecture.
-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.
     Args:
         vocab_size (`int`, *optional*, defaults to 50265):
@@ -202,15 +202,15 @@ class BridgeTowerTextConfig(PretrainedConfig):
         self.eos_token_id = eos_token_id
-class BridgeTowerConfig(PretrainedConfig):
+class BridgeTowerConfig(PreTrainedConfig):
     r"""
     This is the configuration class to store the configuration of a [`BridgeTowerModel`]. It is used to instantiate a
     BridgeTower model according to the specified arguments, defining the model architecture. Instantiating a
     configuration with the defaults will yield a similar configuration to that of the bridgetower-base
     [BridgeTower/bridgetower-base](https://huggingface.co/BridgeTower/bridgetower-base/) architecture.
-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.
     Args:
         share_cross_modal_transformer_layers (`bool`, *optional*, defaults to `True`):

View File

@@ -14,22 +14,22 @@
 # limitations under the License.
 """Bros model configuration"""
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...utils import logging
 logger = logging.get_logger(__name__)
-class BrosConfig(PretrainedConfig):
+class BrosConfig(PreTrainedConfig):
     r"""
     This is the configuration class to store the configuration of a [`BrosModel`] or a [`TFBrosModel`]. It is used to
     instantiate a Bros model according to the specified arguments, defining the model architecture. Instantiating a
     configuration with the defaults will yield a similar configuration to that of the Bros
     [jinho8345/bros-base-uncased](https://huggingface.co/jinho8345/bros-base-uncased) architecture.
-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.
     Args:
         vocab_size (`int`, *optional*, defaults to 30522):

View File

@@ -18,7 +18,7 @@
 from collections import OrderedDict
 from collections.abc import Mapping
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...onnx import OnnxConfig
 from ...utils import logging
@@ -26,15 +26,15 @@ from ...utils import logging
 logger = logging.get_logger(__name__)
-class CamembertConfig(PretrainedConfig):
+class CamembertConfig(PreTrainedConfig):
     """
     This is the configuration class to store the configuration of a [`CamembertModel`] or a [`TFCamembertModel`]. It is
     used to instantiate a Camembert model according to the specified arguments, defining the model architecture.
     Instantiating a configuration with the defaults will yield a similar configuration to that of the Camembert
     [almanach/camembert-base](https://huggingface.co/almanach/camembert-base) architecture.
-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.
     Args:

View File

@@ -14,22 +14,22 @@
 # limitations under the License.
 """CANINE model configuration"""
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...utils import logging
 logger = logging.get_logger(__name__)
-class CanineConfig(PretrainedConfig):
+class CanineConfig(PreTrainedConfig):
     r"""
     This is the configuration class to store the configuration of a [`CanineModel`]. It is used to instantiate an
     CANINE model according to the specified arguments, defining the model architecture. Instantiating a configuration
     with the defaults will yield a similar configuration to that of the CANINE
     [google/canine-s](https://huggingface.co/google/canine-s) architecture.
-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.
     Args:

View File

@@ -16,19 +16,19 @@
 from typing import Optional
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...utils import logging
 logger = logging.get_logger(__name__)
-class ChameleonVQVAEConfig(PretrainedConfig):
+class ChameleonVQVAEConfig(PreTrainedConfig):
     r"""
     This is the configuration class to store the configuration of a [`ChameleonVQModel`]. It is used to instantiate a
     `ChameleonVQModel` according to the specified arguments, defining the model architecture.
-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information. Instantiating a
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information. Instantiating a
     configuration with the defaults will yield a similar configuration to the VQModel of the
     [meta/chameleon-7B](https://huggingface.co/meta/chameleon-7B).
@@ -97,15 +97,15 @@ class ChameleonVQVAEConfig(PretrainedConfig):
         self.initializer_range = initializer_range
-class ChameleonConfig(PretrainedConfig):
+class ChameleonConfig(PreTrainedConfig):
     r"""
     This is the configuration class to store the configuration of a [`ChameleonModel`]. It is used to instantiate a
     chameleon model according to the specified arguments, defining the model architecture. Instantiating a
     configuration with the defaults will yield a similar configuration to that of the
     [meta/chameleon-7B](https://huggingface.co/meta/chameleon-7B).
-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.
     Args:

View File

@@ -22,7 +22,7 @@ from typing import TYPE_CHECKING, Any
 if TYPE_CHECKING:
     from ...processing_utils import ProcessorMixin
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...onnx import OnnxConfig
 from ...utils import logging
@@ -30,7 +30,7 @@ from ...utils import logging
 logger = logging.get_logger(__name__)
-class ChineseCLIPTextConfig(PretrainedConfig):
+class ChineseCLIPTextConfig(PreTrainedConfig):
     r"""
     This is the configuration class to store the configuration of a [`ChineseCLIPModel`]. It is used to instantiate a
     Chinese CLIP model according to the specified arguments, defining the model architecture. Instantiating a
@@ -38,8 +38,8 @@ class ChineseCLIPTextConfig(PretrainedConfig):
     [OFA-Sys/chinese-clip-vit-base-patch16](https:
     //huggingface.co/OFA-Sys/chinese-clip-vit-base-patch16) architecture.
-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.
     Args:
@@ -142,15 +142,15 @@ class ChineseCLIPTextConfig(PretrainedConfig):
         self.use_cache = use_cache
-class ChineseCLIPVisionConfig(PretrainedConfig):
+class ChineseCLIPVisionConfig(PreTrainedConfig):
     r"""
     This is the configuration class to store the configuration of a [`ChineseCLIPModel`]. It is used to instantiate an
     ChineseCLIP model according to the specified arguments, defining the model architecture. Instantiating a
     configuration with the defaults will yield a similar configuration to that of the ChineseCLIP
     [OFA-Sys/chinese-clip-vit-base-patch16](https://huggingface.co/OFA-Sys/chinese-clip-vit-base-patch16) architecture.
-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.
     Args:
@@ -233,7 +233,7 @@ class ChineseCLIPVisionConfig(PretrainedConfig):
         self.hidden_act = hidden_act
-class ChineseCLIPConfig(PretrainedConfig):
+class ChineseCLIPConfig(PreTrainedConfig):
     r"""
     [`ChineseCLIPConfig`] is the configuration class to store the configuration of a [`ChineseCLIPModel`]. It is used
     to instantiate Chinese-CLIP model according to the specified arguments, defining the text model and vision model
@@ -241,8 +241,8 @@ class ChineseCLIPConfig(PretrainedConfig):
     Chinese-CLIP [OFA-Sys/chinese-clip-vit-base-patch16](https://huggingface.co/OFA-Sys/chinese-clip-vit-base-patch16)
     architecture.
-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.
     Args:
         text_config (`dict`, *optional*):

View File

@@ -14,22 +14,22 @@
 # limitations under the License.
 """CLAP model configuration"""
-from ...configuration_utils import PretrainedConfig
+from ...configuration_utils import PreTrainedConfig
 from ...utils import logging
 logger = logging.get_logger(__name__)
-class ClapTextConfig(PretrainedConfig):
+class ClapTextConfig(PreTrainedConfig):
     r"""
     This is the configuration class to store the configuration of a [`ClapTextModel`]. It is used to instantiate a CLAP
     model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
     defaults will yield a similar configuration to that of the CLAP
     [calp-hsat-fused](https://huggingface.co/laion/clap-hsat-fused) architecture.
-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.
     Args:
@@ -136,15 +136,15 @@ class ClapTextConfig(PretrainedConfig):
         self.projection_dim = projection_dim
-class ClapAudioConfig(PretrainedConfig):
+class ClapAudioConfig(PreTrainedConfig):
     r"""
     This is the configuration class to store the configuration of a [`ClapAudioModel`]. It is used to instantiate a
     CLAP audio encoder according to the specified arguments, defining the model architecture. Instantiating a
     configuration with the defaults will yield a similar configuration to that of the audio encoder of the CLAP
     [laion/clap-htsat-fused](https://huggingface.co/laion/clap-htsat-fused) architecture.
-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.
     Args:
         window_size (`int`, *optional*, defaults to 8):
@@ -289,15 +289,15 @@ class ClapAudioConfig(PretrainedConfig):
         self.projection_hidden_act = projection_hidden_act
-class ClapConfig(PretrainedConfig):
+class ClapConfig(PreTrainedConfig):
     r"""
     [`ClapConfig`] is the configuration class to store the configuration of a [`ClapModel`]. It is used to instantiate
     a CLAP model according to the specified arguments, defining the text model and audio model configs. Instantiating a
     configuration with the defaults will yield a similar configuration to that of the CLAP
     [laion/clap-htsat-fused](https://huggingface.co/laion/clap-htsat-fused) architecture.
-    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
-    documentation from [`PretrainedConfig`] for more information.
+    Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
+    documentation from [`PreTrainedConfig`] for more information.
     Args:
         text_config (`dict`, *optional*):
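The rename in this commit is mechanical, but the commit message also notes "add metaclass for full BC": code that still imports or subclasses the old `PretrainedConfig` name must keep working. A minimal, hypothetical sketch of such a deprecation alias (the class bodies and warning text here are illustrative, not the actual transformers implementation):

```python
import warnings


class PreTrainedConfig:
    """Stand-in for the renamed canonical class (simplified for illustration)."""

    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)


class _DeprecatedClassMeta(type):
    """Metaclass that warns whenever the old alias is instantiated."""

    def __call__(cls, *args, **kwargs):
        warnings.warn(
            f"{cls.__name__} is deprecated; use PreTrainedConfig instead.",
            FutureWarning,
            stacklevel=2,
        )
        return super().__call__(*args, **kwargs)

    def __instancecheck__(cls, instance):
        # isinstance(obj, PretrainedConfig) keeps returning True even for
        # objects created through the new class name.
        return isinstance(instance, PreTrainedConfig)


class PretrainedConfig(PreTrainedConfig, metaclass=_DeprecatedClassMeta):
    """Deprecated alias kept for backward compatibility."""
```

Because the alias subclasses the new class, attribute access and `from_pretrained`-style classmethods are inherited unchanged, while the metaclass hook keeps `isinstance` checks symmetric across both names during the deprecation window.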

Some files were not shown because too many files have changed in this diff.