Mirror of https://github.com/huggingface/transformers.git, synced 2025-10-20 17:13:56 +08:00

Standardize `PretrainedConfig` to `PreTrainedConfig` (#41300)

* replace
* add metaclass for full BC
* doc
* consistency
* update deprecation message
* revert
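The commit message mentions a metaclass added for full backward compatibility. Below is a minimal, illustrative sketch of how a deprecated alias backed by such a metaclass could behave — the names mirror the commit, but the body is an assumption, not the code actually merged in #41300.

```python
import warnings


class PreTrainedConfig:
    """Stand-in for the new base configuration class."""

    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)


class _DeprecatedConfigMeta(type):
    # Illustrative only: route isinstance/issubclass checks against the old
    # name to the new class so downstream type checks keep working.
    def __instancecheck__(cls, instance):
        return isinstance(instance, PreTrainedConfig)

    def __subclasscheck__(cls, subclass):
        return issubclass(subclass, PreTrainedConfig)


class PretrainedConfig(PreTrainedConfig, metaclass=_DeprecatedConfigMeta):
    """Deprecated alias kept for backward compatibility."""

    def __init__(self, **kwargs):
        warnings.warn(
            "`PretrainedConfig` is deprecated, use `PreTrainedConfig` instead.",
            FutureWarning,
        )
        super().__init__(**kwargs)
```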
@ -52,7 +52,7 @@
|
||||
<figcaption class="mt-2 text-center text-sm text-gray-500">الصورة توضح مخطط مراحل نموذج Swin.</figcaption>
|
||||
</div>
|
||||
|
||||
يسمح لك [`AutoBackbone`] باستخدام النماذج المُدربة مسبقًا كعمود فقري للحصول على خرائط ميزات من مراحل مختلفة من العمود الفقري. يجب عليك تحديد أحد المعلمات التالية في [`~PretrainedConfig.from_pretrained`]:
|
||||
يسمح لك [`AutoBackbone`] باستخدام النماذج المُدربة مسبقًا كعمود فقري للحصول على خرائط ميزات من مراحل مختلفة من العمود الفقري. يجب عليك تحديد أحد المعلمات التالية في [`~PreTrainedConfig.from_pretrained`]:
|
||||
|
||||
* `out_indices` هو فهرس الطبقة التي تريد الحصول على خريطة الميزات منها
|
||||
* `out_features` هو اسم الطبقة التي تريد الحصول على خريطة الميزات منها
|
||||
|
@ -54,19 +54,19 @@ DistilBertConfig {
|
||||
|
||||
```
|
||||
|
||||
يمكن تعديل خصائص النموذج المدرب مسبقًا في دالة [`~PretrainedConfig.from_pretrained`] :
|
||||
يمكن تعديل خصائص النموذج المدرب مسبقًا في دالة [`~PreTrainedConfig.from_pretrained`] :
|
||||
|
||||
```py
|
||||
>>> my_config = DistilBertConfig.from_pretrained("distilbert/distilbert-base-uncased", activation="relu", attention_dropout=0.4)
|
||||
```
|
||||
|
||||
بمجرد أن تصبح راضيًا عن تكوين نموذجك، يمكنك حفظه باستخدام [`~PretrainedConfig.save_pretrained`]. يتم تخزين ملف التكوين الخاص بك على أنه ملف JSON في دليل الحفظ المحدد:
|
||||
بمجرد أن تصبح راضيًا عن تكوين نموذجك، يمكنك حفظه باستخدام [`~PreTrainedConfig.save_pretrained`]. يتم تخزين ملف التكوين الخاص بك على أنه ملف JSON في دليل الحفظ المحدد:
|
||||
|
||||
```py
|
||||
>>> my_config.save_pretrained(save_directory="./your_model_save_path")
|
||||
```
|
||||
|
||||
لإعادة استخدام ملف التكوين، قم بتحميله باستخدام [`~PretrainedConfig.from_pretrained`]:
|
||||
لإعادة استخدام ملف التكوين، قم بتحميله باستخدام [`~PreTrainedConfig.from_pretrained`]:
|
||||
|
||||
```py
|
||||
>>> my_config = DistilBertConfig.from_pretrained("./your_model_save_path/config.json")
|
||||
|
@ -20,11 +20,11 @@
|
||||
في مثالنا، سنعدّل بعض الوسائط في فئة ResNet التي قد نرغب في ضبطها. ستعطينا التكوينات المختلفة أنواع ResNets المختلفة الممكنة. سنقوم بتخزين هذه الوسائط بعد التحقق من صحته.
|
||||
|
||||
```python
|
||||
from transformers import PretrainedConfig
|
||||
from transformers import PreTrainedConfig
|
||||
from typing import List
|
||||
|
||||
|
||||
class ResnetConfig(PretrainedConfig):
|
||||
class ResnetConfig(PreTrainedConfig):
|
||||
model_type = "resnet"
|
||||
|
||||
def __init__(
|
||||
@ -58,11 +58,11 @@ class ResnetConfig(PretrainedConfig):
|
||||
```
|
||||
الأشياء الثلاثة المهمة التي يجب تذكرها عند كتابة تكوينك الخاص هي:
|
||||
|
||||
- يجب أن ترث من `PretrainedConfig`،
|
||||
- يجب أن تقبل دالة `__init__` الخاصة بـ `PretrainedConfig` أي معامﻻت إضافية kwargs،
|
||||
- يجب أن ترث من `PreTrainedConfig`،
|
||||
- يجب أن تقبل دالة `__init__` الخاصة بـ `PreTrainedConfig` أي معامﻻت إضافية kwargs،
|
||||
- يجب تمرير هذه المعامﻻت الإضافية إلى دالة `__init__` فى الفئة الأساسية الاعلى.
|
||||
|
||||
يضمن الإرث حصولك على جميع الوظائف من مكتبة 🤗 Transformers، في حين أن القيدين التانى والثالث يأتيان من حقيقة أن `PretrainedConfig` لديه المزيد من الحقول أكثر من تلك التي تقوم بتعيينها. عند إعادة تحميل تكوين باستخدام طريقة `from_pretrained`، يجب أن يقبل تكوينك هذه الحقول ثم إرسالها إلى الفئة الأساسية الأعلى.
|
||||
يضمن الإرث حصولك على جميع الوظائف من مكتبة 🤗 Transformers، في حين أن القيدين التانى والثالث يأتيان من حقيقة أن `PreTrainedConfig` لديه المزيد من الحقول أكثر من تلك التي تقوم بتعيينها. عند إعادة تحميل تكوين باستخدام طريقة `from_pretrained`، يجب أن يقبل تكوينك هذه الحقول ثم إرسالها إلى الفئة الأساسية الأعلى.
|
||||
|
||||
تحديد `model_type` لتكوينك (هنا `model_type="resnet"`) ليس إلزاميًا، ما لم ترغب في
|
||||
تسجيل نموذجك باستخدام الفئات التلقائية (راجع القسم الأخير).
|
||||
@ -82,7 +82,7 @@ resnet50d_config.save_pretrained("custom-resnet")
|
||||
resnet50d_config = ResnetConfig.from_pretrained("custom-resnet")
|
||||
```
|
||||
|
||||
يمكنك أيضًا استخدام أي طريقة أخرى من فئة [`PretrainedConfig`]، مثل [`~PretrainedConfig.push_to_hub`] لتحميل تكوينك مباشرة إلى Hub.
|
||||
يمكنك أيضًا استخدام أي طريقة أخرى من فئة [`PreTrainedConfig`]، مثل [`~PreTrainedConfig.push_to_hub`] لتحميل تكوينك مباشرة إلى Hub.
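A one-line sketch of that upload step, assuming `resnet50d_config` is the configuration built earlier in this guide and that you are logged in to the Hub (the repository id is a placeholder):

```python
resnet50d_config.push_to_hub("your-username/custom-resnet")  # placeholder repo id
```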
## كتابة نموذج مخصص
|
||||
|
||||
|
@ -53,7 +53,7 @@ Lassen Sie uns daher ein wenig tiefer in das allgemeine Design der Bibliothek ei
|
||||
### Überblick über die Modelle
|
||||
|
||||
Um ein Modell erfolgreich hinzuzufügen, ist es wichtig, die Interaktion zwischen Ihrem Modell und seiner Konfiguration zu verstehen,
|
||||
[`PreTrainedModel`] und [`PretrainedConfig`]. Als Beispiel werden wir
|
||||
[`PreTrainedModel`] und [`PreTrainedConfig`]. Als Beispiel werden wir
|
||||
das Modell, das zu 🤗 Transformers hinzugefügt werden soll, `BrandNewBert` nennen.
|
||||
|
||||
Schauen wir uns das mal an:
|
||||
@ -81,10 +81,10 @@ model.config # model has access to its config
|
||||
```
|
||||
|
||||
Ähnlich wie das Modell erbt die Konfiguration grundlegende Serialisierungs- und Deserialisierungsfunktionalitäten von
|
||||
[`PretrainedConfig`]. Beachten Sie, dass die Konfiguration und das Modell immer in zwei verschiedene Formate serialisiert werden
|
||||
[`PreTrainedConfig`]. Beachten Sie, dass die Konfiguration und das Modell immer in zwei verschiedene Formate serialisiert werden
|
||||
unterschiedliche Formate serialisiert werden - das Modell in eine *pytorch_model.bin* Datei und die Konfiguration in eine *config.json* Datei. Aufruf von
|
||||
[`~PreTrainedModel.save_pretrained`] wird automatisch
|
||||
[`~PretrainedConfig.save_pretrained`] auf, so dass sowohl das Modell als auch die Konfiguration gespeichert werden.
|
||||
[`~PreTrainedConfig.save_pretrained`] auf, so dass sowohl das Modell als auch die Konfiguration gespeichert werden.
|
||||
|
||||
|
||||
### Code-Stil
|
||||
|
@@ -51,7 +51,7 @@ This section describes how the model and configuration classes interact and the

### Model and configuration

All Transformers' models inherit from a base [`PreTrainedModel`] and [`PretrainedConfig`] class. The configuration is the model's blueprint.

All Transformers' models inherit from a base [`PreTrainedModel`] and [`PreTrainedConfig`] class. The configuration is the model's blueprint.

There are never more than two levels of abstraction for any model, to keep the code readable. The example model here, BrandNewLlama, inherits from `BrandNewLlamaPreTrainedModel` and [`PreTrainedModel`]. It is important that a new model only depends on [`PreTrainedModel`] so that it can use the [`~PreTrainedModel.from_pretrained`] and [`~PreTrainedModel.save_pretrained`] methods.

@@ -66,9 +66,9 @@ model = BrandNewLlamaModel.from_pretrained("username/brand_new_llama")
model.config
```

[`PretrainedConfig`] provides the [`~PretrainedConfig.from_pretrained`] and [`~PretrainedConfig.save_pretrained`] methods.

[`PreTrainedConfig`] provides the [`~PreTrainedConfig.from_pretrained`] and [`~PreTrainedConfig.save_pretrained`] methods.

When you use [`PreTrainedModel.save_pretrained`], it automatically calls [`PretrainedConfig.save_pretrained`] so that both the model and configuration are saved together.

When you use [`PreTrainedModel.save_pretrained`], it automatically calls [`PreTrainedConfig.save_pretrained`] so that both the model and configuration are saved together.

A model is saved to a `model.safetensors` file and a configuration is saved to a `config.json` file.
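A minimal sketch of that save/load round trip; the checkpoint name and output directory are placeholders:

```python
from transformers import AutoConfig, AutoModel

model = AutoModel.from_pretrained("google-bert/bert-base-uncased")  # placeholder checkpoint

# Saving the model also writes its configuration: the directory ends up
# containing model.safetensors and config.json side by side.
model.save_pretrained("./my-checkpoint")

# The configuration can then be reloaded on its own.
config = AutoConfig.from_pretrained("./my-checkpoint")
print(config.model_type)
```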
@@ -22,7 +22,7 @@ Higher-level computer vision tasks, such as object detection or image segmentat
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/Backbone.png"/>
</div>

Load a backbone with [`~PretrainedConfig.from_pretrained`] and use the `out_indices` parameter to determine which layer, given by the index, to extract a feature map from.

Load a backbone with [`~PreTrainedConfig.from_pretrained`] and use the `out_indices` parameter to determine which layer, given by the index, to extract a feature map from.

```py
from transformers import AutoBackbone

@@ -46,7 +46,7 @@ There are two ways to load a Transformers backbone, [`AutoBackbone`] and a model
<hfoptions id="backbone-classes">
<hfoption id="AutoBackbone">

The [AutoClass](./model_doc/auto) API automatically loads a pretrained vision model with [`~PretrainedConfig.from_pretrained`] as a backbone if it's supported.

The [AutoClass](./model_doc/auto) API automatically loads a pretrained vision model with [`~PreTrainedConfig.from_pretrained`] as a backbone if it's supported.

Set the `out_indices` parameter to the layer you'd like to get the feature map from. If you know the name of the layer, you could also use `out_features`. These parameters can be used interchangeably, but if you use both, make sure they refer to the same layer.
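A short sketch of that usage; the checkpoint, indices, and stage names are illustrative assumptions rather than values prescribed by this page:

```python
import torch
from transformers import AutoBackbone

# Request feature maps from two stages by index; passing
# out_features=["stage2", "stage4"] instead would select stages by name.
backbone = AutoBackbone.from_pretrained("microsoft/resnet-50", out_indices=[2, 4])

pixel_values = torch.randn(1, 3, 224, 224)  # dummy image batch
outputs = backbone(pixel_values)

for feature_map in outputs.feature_maps:
    print(feature_map.shape)  # one tensor per requested stage
```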
@ -25,12 +25,12 @@ This guide will show you how to customize a ResNet model, enable [AutoClass](./m
|
||||
|
||||
## Configuration
|
||||
|
||||
A configuration, given by the base [`PretrainedConfig`] class, contains all the necessary information to build a model. This is where you'll configure the attributes of the custom ResNet model. Different attributes give different ResNet model types.

A configuration, given by the base [`PreTrainedConfig`] class, contains all the necessary information to build a model. This is where you'll configure the attributes of the custom ResNet model. Different attributes give different ResNet model types.
|
||||
|
||||
The main rules for customizing a configuration are:
|
||||
|
||||
1. A custom configuration must subclass [`PretrainedConfig`]. This ensures a custom model has all the functionality of a Transformers' model such as [`~PretrainedConfig.from_pretrained`], [`~PretrainedConfig.save_pretrained`], and [`~PretrainedConfig.push_to_hub`].
|
||||
2. The [`PretrainedConfig`] `__init__` must accept any `kwargs` and they must be passed to the superclass `__init__`. [`PretrainedConfig`] has more fields than the ones set in your custom configuration, so when you load a configuration with [`~PretrainedConfig.from_pretrained`], those fields need to be accepted by your configuration and passed to the superclass.
|
||||
1. A custom configuration must subclass [`PreTrainedConfig`]. This ensures a custom model has all the functionality of a Transformers' model such as [`~PreTrainedConfig.from_pretrained`], [`~PreTrainedConfig.save_pretrained`], and [`~PreTrainedConfig.push_to_hub`].
|
||||
2. The [`PreTrainedConfig`] `__init__` must accept any `kwargs` and they must be passed to the superclass `__init__`. [`PreTrainedConfig`] has more fields than the ones set in your custom configuration, so when you load a configuration with [`~PreTrainedConfig.from_pretrained`], those fields need to be accepted by your configuration and passed to the superclass.
|
||||
|
||||
> [!TIP]
|
||||
> It is useful to check the validity of some of the parameters. In the example below, a check is implemented to ensure `block_type` and `stem_type` belong to one of the predefined values.
|
||||
@ -38,10 +38,10 @@ The main rules for customizing a configuration are:
|
||||
> Add `model_type` to the configuration class to enable [AutoClass](./models#autoclass) support.
|
||||
|
||||
```py
|
||||
from transformers import PretrainedConfig
|
||||
from transformers import PreTrainedConfig
|
||||
from typing import List
|
||||
|
||||
class ResnetConfig(PretrainedConfig):
|
||||
class ResnetConfig(PreTrainedConfig):
|
||||
model_type = "resnet"
|
||||
|
||||
def __init__(
|
||||
@ -74,7 +74,7 @@ class ResnetConfig(PretrainedConfig):
|
||||
super().__init__(**kwargs)
|
||||
```
|
||||
|
||||
Save the configuration to a JSON file in your custom model folder, `custom-resnet`, with [`~PretrainedConfig.save_pretrained`].
|
||||
Save the configuration to a JSON file in your custom model folder, `custom-resnet`, with [`~PreTrainedConfig.save_pretrained`].
|
||||
|
||||
```py
|
||||
resnet50d_config = ResnetConfig(block_type="bottleneck", stem_width=32, stem_type="deep", avg_down=True)
|
||||
@ -83,7 +83,7 @@ resnet50d_config.save_pretrained("custom-resnet")
|
||||
|
||||
## Model
|
||||
|
||||
With the custom ResNet configuration, you can now create and customize the model. The model subclasses the base [`PreTrainedModel`] class. Like [`PretrainedConfig`], inheriting from [`PreTrainedModel`] and initializing the superclass with the configuration extends Transformers' functionalities such as saving and loading to the custom model.
|
||||
With the custom ResNet configuration, you can now create and customize the model. The model subclasses the base [`PreTrainedModel`] class. Like [`PreTrainedConfig`], inheriting from [`PreTrainedModel`] and initializing the superclass with the configuration extends Transformers' functionalities such as saving and loading to the custom model.
|
||||
|
||||
Transformers' models follow the convention of accepting a `config` object in the `__init__` method. This passes the entire `config` to the model sublayers, instead of breaking the `config` object into multiple arguments that are individually passed to the sublayers.
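A self-contained toy sketch of that convention (the class names and sizes are made up for illustration; depending on your transformers version the base configuration class is exposed as `PretrainedConfig` or `PreTrainedConfig`, which is exactly what this commit standardizes):

```python
import torch
from torch import nn
from transformers import PreTrainedModel

try:  # releases that include this commit
    from transformers import PreTrainedConfig as BaseConfig
except ImportError:  # older releases
    from transformers import PretrainedConfig as BaseConfig


class ToyConfig(BaseConfig):
    model_type = "toy"

    def __init__(self, hidden_size=16, num_layers=2, **kwargs):
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        super().__init__(**kwargs)


class ToyModel(PreTrainedModel):
    config_class = ToyConfig

    def __init__(self, config):
        super().__init__(config)
        # The whole config object is handed to the model; sublayers read the
        # fields they need instead of receiving them as separate arguments.
        self.layers = nn.ModuleList(
            nn.Linear(config.hidden_size, config.hidden_size) for _ in range(config.num_layers)
        )

    def forward(self, hidden_states):
        for layer in self.layers:
            hidden_states = layer(hidden_states)
        return hidden_states


model = ToyModel(ToyConfig(hidden_size=8, num_layers=3))
print(model(torch.randn(1, 8)).shape)
```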
@ -235,7 +235,7 @@ from resnet_model.configuration_resnet import ResnetConfig
|
||||
from resnet_model.modeling_resnet import ResnetModel, ResnetModelForImageClassification
|
||||
```
|
||||
|
||||
Copy the code from the model and configuration files. To make sure the AutoClass objects are saved with [`~PreTrainedModel.save_pretrained`], call the [`~PretrainedConfig.register_for_auto_class`] method. This modifies the configuration JSON file to include the AutoClass objects and mapping.
|
||||
Copy the code from the model and configuration files. To make sure the AutoClass objects are saved with [`~PreTrainedModel.save_pretrained`], call the [`~PreTrainedConfig.register_for_auto_class`] method. This modifies the configuration JSON file to include the AutoClass objects and mapping.
|
||||
|
||||
For a model, pick the appropriate `AutoModelFor` class based on the task.
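For example (a sketch reusing the Resnet class names from this guide, which are assumed to be defined as shown earlier):

```python
# Record the custom classes in config.json so the AutoClass API can find them later.
ResnetConfig.register_for_auto_class()
ResnetModel.register_for_auto_class("AutoModel")
ResnetModelForImageClassification.register_for_auto_class("AutoModelForImageClassification")
```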
@ -16,7 +16,7 @@ rendered properly in your Markdown viewer.
|
||||
|
||||
# Configuration
|
||||
|
||||
The base class [`PretrainedConfig`] implements the common methods for loading/saving a configuration
|
||||
The base class [`PreTrainedConfig`] implements the common methods for loading/saving a configuration
|
||||
either from a local file or directory, or from a pretrained model configuration provided by the library (downloaded
|
||||
from HuggingFace's AWS S3 repository).
|
||||
|
||||
@ -24,8 +24,8 @@ Each derived config class implements model specific attributes. Common attribute
|
||||
`hidden_size`, `num_attention_heads`, and `num_hidden_layers`. Text models further implement:
|
||||
`vocab_size`.
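Those shared attributes can be read directly off any loaded configuration, for example (placeholder checkpoint):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("google-bert/bert-base-uncased")  # placeholder checkpoint
print(config.hidden_size, config.num_attention_heads, config.num_hidden_layers, config.vocab_size)
```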
## PretrainedConfig
|
||||
## PreTrainedConfig
|
||||
|
||||
[[autodoc]] PretrainedConfig
|
||||
[[autodoc]] PreTrainedConfig
|
||||
- push_to_hub
|
||||
- all
|
||||
|
@ -48,7 +48,7 @@ You will then be able to use the auto classes like you would usually do!
|
||||
|
||||
<Tip warning={true}>
|
||||
|
||||
If your `NewModelConfig` is a subclass of [`~transformers.PretrainedConfig`], make sure its
|
||||
If your `NewModelConfig` is a subclass of [`~transformers.PreTrainedConfig`], make sure its
|
||||
`model_type` attribute is set to the same key you use when registering the config (here `"new-model"`).
|
||||
|
||||
Likewise, if your `NewModel` is a subclass of [`PreTrainedModel`], make sure its
|
||||
|
@ -73,7 +73,7 @@ Each pretrained model inherits from three base classes.
|
||||
|
||||
| **Class** | **Description** |
|---|---|
| [`PretrainedConfig`] | A file that specifies a model's attributes such as the number of attention heads or vocabulary size. |
| [`PreTrainedConfig`] | A file that specifies a model's attributes such as the number of attention heads or vocabulary size. |
| [`PreTrainedModel`] | A model (or architecture) defined by the model attributes from the configuration file. A pretrained model only returns the raw hidden states. For a specific task, use the appropriate model head to convert the raw hidden states into a meaningful result (for example, [`LlamaModel`] versus [`LlamaForCausalLM`]). |
| Preprocessor | A class for converting raw inputs (text, images, audio, multimodal) into numerical inputs to the model. For example, [`PreTrainedTokenizer`] converts text into tensors and [`ImageProcessingMixin`] converts pixels into tensors. |
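A compact sketch that touches all three pieces; the checkpoint is a placeholder:

```python
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

checkpoint = "gpt2"  # placeholder checkpoint

config = AutoConfig.from_pretrained(checkpoint)           # the model's attributes
tokenizer = AutoTokenizer.from_pretrained(checkpoint)     # the preprocessor
model = AutoModelForCausalLM.from_pretrained(checkpoint)  # the architecture with a task head

inputs = tokenizer("Hello there", return_tensors="pt")
outputs = model(**inputs)
print(config.num_hidden_layers, outputs.logits.shape)
```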
@ -21,7 +21,7 @@ rendered properly in your Markdown viewer.
|
||||
Transformers can export a model to TorchScript by:
|
||||
|
||||
1. creating dummy inputs to create a *trace* of the model to serialize to TorchScript
|
||||
2. enabling the `torchscript` parameter in either [`~PretrainedConfig.torchscript`] for a randomly initialized model or [`~PreTrainedModel.from_pretrained`] for a pretrained model
|
||||
2. enabling the `torchscript` parameter in either [`~PreTrainedConfig.torchscript`] for a randomly initialized model or [`~PreTrainedModel.from_pretrained`] for a pretrained model
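A minimal sketch of both steps with a small placeholder checkpoint; the inputs you trace with should match how you will call the exported model:

```python
import torch
from transformers import AutoModel, AutoTokenizer

checkpoint = "google-bert/bert-base-uncased"  # placeholder checkpoint

# Step 2: the torchscript flag makes the model return tuples, which tracing requires.
model = AutoModel.from_pretrained(checkpoint, torchscript=True).eval()

# Step 1: build dummy inputs and record a trace.
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
dummy = tokenizer("This is a dummy input", return_tensors="pt")
traced = torch.jit.trace(model, (dummy["input_ids"], dummy["attention_mask"]))

torch.jit.save(traced, "traced_model.pt")
```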
## Dummy inputs
|
||||
|
||||
|
@ -135,9 +135,9 @@ class MyModel(PreTrainedModel):
|
||||
|
||||
```python
|
||||
|
||||
from transformers import PretrainedConfig
|
||||
from transformers import PreTrainedConfig
|
||||
|
||||
class MyConfig(PretrainedConfig):
|
||||
class MyConfig(PreTrainedConfig):
|
||||
base_model_tp_plan = {
|
||||
"layers.*.self_attn.k_proj": "colwise",
|
||||
"layers.*.self_attn.v_proj": "colwise",
|
||||
|
@ -83,19 +83,19 @@ DistilBertConfig {
|
||||
}
|
||||
```
|
||||
|
||||
Los atributos de los modelos preentrenados pueden ser modificados con la función [`~PretrainedConfig.from_pretrained`]:
|
||||
Los atributos de los modelos preentrenados pueden ser modificados con la función [`~PreTrainedConfig.from_pretrained`]:
|
||||
|
||||
```py
|
||||
>>> my_config = DistilBertConfig.from_pretrained("distilbert/distilbert-base-uncased", activation="relu", attention_dropout=0.4)
|
||||
```
|
||||
|
||||
Cuando estés satisfecho con la configuración de tu modelo, puedes guardarlo con la función [`~PretrainedConfig.save_pretrained`]. Tu configuración se guardará en un archivo JSON dentro del directorio que le especifiques como parámetro.
|
||||
Cuando estés satisfecho con la configuración de tu modelo, puedes guardarlo con la función [`~PreTrainedConfig.save_pretrained`]. Tu configuración se guardará en un archivo JSON dentro del directorio que le especifiques como parámetro.
|
||||
|
||||
```py
|
||||
>>> my_config.save_pretrained(save_directory="./your_model_save_path")
|
||||
```
|
||||
|
||||
Para volver a usar el archivo de configuración, puedes cargarlo usando [`~PretrainedConfig.from_pretrained`]:
|
||||
Para volver a usar el archivo de configuración, puedes cargarlo usando [`~PreTrainedConfig.from_pretrained`]:
|
||||
|
||||
```py
|
||||
>>> my_config = DistilBertConfig.from_pretrained("./your_model_save_path/my_config.json")
|
||||
|
@ -38,11 +38,11 @@ configuraciones nos darán los diferentes tipos de ResNet que son posibles. Lueg
|
||||
después de verificar la validez de algunos de ellos.
|
||||
|
||||
```python
|
||||
from transformers import PretrainedConfig
|
||||
from transformers import PreTrainedConfig
|
||||
from typing import List
|
||||
|
||||
|
||||
class ResnetConfig(PretrainedConfig):
|
||||
class ResnetConfig(PreTrainedConfig):
|
||||
model_type = "resnet"
|
||||
|
||||
def __init__(
|
||||
@ -76,12 +76,12 @@ class ResnetConfig(PretrainedConfig):
|
||||
```
|
||||
|
||||
Las tres cosas importantes que debes recordar al escribir tu propia configuración son las siguientes:
|
||||
- tienes que heredar de `PretrainedConfig`,
|
||||
- el `__init__` de tu `PretrainedConfig` debe aceptar cualquier `kwargs`,
|
||||
- tienes que heredar de `PreTrainedConfig`,
|
||||
- el `__init__` de tu `PreTrainedConfig` debe aceptar cualquier `kwargs`,
|
||||
- esos `kwargs` deben pasarse a la superclase `__init__`.
|
||||
|
||||
La herencia es para asegurarte de obtener toda la funcionalidad de la biblioteca 🤗 Transformers, mientras que las otras dos
|
||||
restricciones provienen del hecho de que una `PretrainedConfig` tiene más campos que los que estás configurando. Al recargar una
|
||||
restricciones provienen del hecho de que una `PreTrainedConfig` tiene más campos que los que estás configurando. Al recargar una
|
||||
`config` con el método `from_pretrained`, esos campos deben ser aceptados por tu `config` y luego enviados a la superclase.
|
||||
|
||||
Definir un `model_type` para tu configuración (en este caso `model_type="resnet"`) no es obligatorio, a menos que quieras
|
||||
@ -102,7 +102,7 @@ con el método `from_pretrained`:
|
||||
resnet50d_config = ResnetConfig.from_pretrained("custom-resnet")
|
||||
```
|
||||
|
||||
También puedes usar cualquier otro método de la clase [`PretrainedConfig`], como [`~PretrainedConfig.push_to_hub`], para cargar
|
||||
También puedes usar cualquier otro método de la clase [`PreTrainedConfig`], como [`~PreTrainedConfig.push_to_hub`], para cargar
|
||||
directamente tu configuración en el Hub.
|
||||
|
||||
## Escribir un modelo personalizado
|
||||
|
@ -71,7 +71,7 @@ Pour les tâches de vision, un processeur d'image traite l'image pour la formate
|
||||
<figcaption class="mt-2 text-center text-sm text-gray-500">Un backbone Swin avec plusieurs étapes pour produire une carte de caractéristiques.</figcaption>
|
||||
</div>
|
||||
|
||||
[`AutoBackbone`] vous permet d'utiliser des modèles pré-entraînés comme backbones pour obtenir des cartes de caractéristiques à partir de différentes étapes du backbone. Vous devez spécifier l'un des paramètres suivants dans [`~PretrainedConfig.from_pretrained`] :
|
||||
[`AutoBackbone`] vous permet d'utiliser des modèles pré-entraînés comme backbones pour obtenir des cartes de caractéristiques à partir de différentes étapes du backbone. Vous devez spécifier l'un des paramètres suivants dans [`~PreTrainedConfig.from_pretrained`] :
|
||||
|
||||
* `out_indices` est l'index de la couche dont vous souhaitez obtenir la carte de caractéristiques
|
||||
* `out_features` est le nom de la couche dont vous souhaitez obtenir la carte de caractéristiques
|
||||
|
@ -67,7 +67,7 @@ Tenendo questi principi in mente, immergiamoci nel design generale della libreri
|
||||
### Panoramica sui modelli
|
||||
|
||||
Per aggiungere con successo un modello, é importante capire l'interazione tra il tuo modello e la sua configurazione,
|
||||
[`PreTrainedModel`], e [`PretrainedConfig`]. Per dare un esempio, chiameremo il modello da aggiungere a 🤗 Transformers
|
||||
[`PreTrainedModel`], e [`PreTrainedConfig`]. Per dare un esempio, chiameremo il modello da aggiungere a 🤗 Transformers
|
||||
`BrandNewBert`.
|
||||
|
||||
Diamo un'occhiata:
|
||||
@ -94,9 +94,9 @@ model.config # il modello ha accesso al suo config
|
||||
```
|
||||
|
||||
Analogamente al modello, la configurazione eredita le funzionalità base di serializzazione e deserializzazione da
|
||||
[`PretrainedConfig`]. É da notare che la configurazione e il modello sono sempre serializzati in due formati differenti -
|
||||
[`PreTrainedConfig`]. É da notare che la configurazione e il modello sono sempre serializzati in due formati differenti -
|
||||
il modello é serializzato in un file *pytorch_model.bin* mentre la configurazione con *config.json*. Chiamando
|
||||
[`~PreTrainedModel.save_pretrained`] automaticamente chiamerà [`~PretrainedConfig.save_pretrained`], cosicché sia il
|
||||
[`~PreTrainedModel.save_pretrained`] automaticamente chiamerà [`~PreTrainedConfig.save_pretrained`], cosicché sia il
|
||||
modello che la configurazione siano salvati.
|
||||
|
||||
|
||||
|
@ -83,19 +83,19 @@ DistilBertConfig {
|
||||
}
|
||||
```
|
||||
|
||||
Nella funzione [`~PretrainedConfig.from_pretrained`] possono essere modificati gli attributi del modello pre-allenato:
|
||||
Nella funzione [`~PreTrainedConfig.from_pretrained`] possono essere modificati gli attributi del modello pre-allenato:
|
||||
|
||||
```py
|
||||
>>> my_config = DistilBertConfig.from_pretrained("distilbert/distilbert-base-uncased", activation="relu", attention_dropout=0.4)
|
||||
```
|
||||
|
||||
Quando la configurazione del modello ti soddisfa, la puoi salvare con [`~PretrainedConfig.save_pretrained`]. Il file della tua configurazione è memorizzato come file JSON nella save directory specificata:
|
||||
Quando la configurazione del modello ti soddisfa, la puoi salvare con [`~PreTrainedConfig.save_pretrained`]. Il file della tua configurazione è memorizzato come file JSON nella save directory specificata:
|
||||
|
||||
```py
|
||||
>>> my_config.save_pretrained(save_directory="./your_model_save_path")
|
||||
```
|
||||
|
||||
Per riutilizzare la configurazione del file, caricalo con [`~PretrainedConfig.from_pretrained`]:
|
||||
Per riutilizzare la configurazione del file, caricalo con [`~PreTrainedConfig.from_pretrained`]:
|
||||
|
||||
```py
|
||||
>>> my_config = DistilBertConfig.from_pretrained("./your_model_save_path/my_config.json")
|
||||
|
@ -37,11 +37,11 @@ Configurazioni differenti ci daranno quindi i differenti possibili tipi di ResNe
|
||||
dopo averne controllato la validità.
|
||||
|
||||
```python
|
||||
from transformers import PretrainedConfig
|
||||
from transformers import PreTrainedConfig
|
||||
from typing import List
|
||||
|
||||
|
||||
class ResnetConfig(PretrainedConfig):
|
||||
class ResnetConfig(PreTrainedConfig):
|
||||
model_type = "resnet"
|
||||
|
||||
def __init__(
|
||||
@ -75,12 +75,12 @@ class ResnetConfig(PretrainedConfig):
|
||||
```
|
||||
|
||||
Le tre cose più importanti da ricordare quando scrivi le tue configurazioni sono le seguenti:
|
||||
- Devi ereditare da `Pretrainedconfig`,
|
||||
- Il metodo `__init__` del tuo `Pretrainedconfig` deve accettare i kwargs,
|
||||
- Devi ereditare da `PreTrainedConfig`,
|
||||
- Il metodo `__init__` del tuo `PreTrainedConfig` deve accettare i kwargs,
|
||||
- I `kwargs` devono essere passati alla superclass `__init__`
|
||||
|
||||
L’eredità è importante per assicurarsi di ottenere tutte le funzionalità della libreria 🤗 transformers,
|
||||
mentre gli altri due vincoli derivano dal fatto che un `Pretrainedconfig` ha più campi di quelli che stai settando.
|
||||
mentre gli altri due vincoli derivano dal fatto che un `PreTrainedConfig` ha più campi di quelli che stai settando.
|
||||
Quando ricarichi una config da un metodo `from_pretrained`, questi campi devono essere accettati dalla tua config e
|
||||
poi inviati alla superclasse.
|
||||
|
||||
@ -102,7 +102,7 @@ config con il metodo `from_pretrained`.
|
||||
resnet50d_config = ResnetConfig.from_pretrained("custom-resnet")
|
||||
```
|
||||
|
||||
Puoi anche usare qualunque altro metodo della classe [`PretrainedConfig`], come [`~PretrainedConfig.push_to_hub`]
|
||||
Puoi anche usare qualunque altro metodo della classe [`PreTrainedConfig`], come [`~PreTrainedConfig.push_to_hub`]
|
||||
per caricare direttamente la tua configurazione nell'hub.
|
||||
|
||||
## Scrivere un modello personalizzato
|
||||
|
@ -51,7 +51,7 @@ Hugging Faceチームのメンバーがサポートを提供するので、一
|
||||
|
||||
### Overview of models
|
||||
|
||||
モデルを正常に追加するためには、モデルとその設定、[`PreTrainedModel`]、および[`PretrainedConfig`]の相互作用を理解することが重要です。
|
||||
モデルを正常に追加するためには、モデルとその設定、[`PreTrainedModel`]、および[`PreTrainedConfig`]の相互作用を理解することが重要です。
|
||||
例示的な目的で、🤗 Transformersに追加するモデルを「BrandNewBert」と呼びます。
|
||||
|
||||
以下をご覧ください:
|
||||
@ -77,7 +77,7 @@ model = BrandNewBertModel.from_pretrained("brandy/brand_new_bert")
|
||||
model.config # model has access to its config
|
||||
```
|
||||
|
||||
モデルと同様に、設定は[`PretrainedConfig`]から基本的なシリアル化および逆シリアル化の機能を継承しています。注意すべきは、設定とモデルは常に2つの異なる形式にシリアル化されることです - モデルは*pytorch_model.bin*ファイルに、設定は*config.json*ファイルにシリアル化されます。[`~PreTrainedModel.save_pretrained`]を呼び出すと、自動的に[`~PretrainedConfig.save_pretrained`]も呼び出され、モデルと設定の両方が保存されます。
|
||||
モデルと同様に、設定は[`PreTrainedConfig`]から基本的なシリアル化および逆シリアル化の機能を継承しています。注意すべきは、設定とモデルは常に2つの異なる形式にシリアル化されることです - モデルは*pytorch_model.bin*ファイルに、設定は*config.json*ファイルにシリアル化されます。[`~PreTrainedModel.save_pretrained`]を呼び出すと、自動的に[`~PreTrainedConfig.save_pretrained`]も呼び出され、モデルと設定の両方が保存されます。
|
||||
|
||||
### Code style
|
||||
|
||||
|
@ -86,19 +86,19 @@ DistilBertConfig {
|
||||
}
|
||||
```
|
||||
|
||||
事前学習済みモデルの属性は、[`~PretrainedConfig.from_pretrained`] 関数で変更できます:
|
||||
事前学習済みモデルの属性は、[`~PreTrainedConfig.from_pretrained`] 関数で変更できます:
|
||||
|
||||
```py
|
||||
>>> my_config = DistilBertConfig.from_pretrained("distilbert/distilbert-base-uncased", activation="relu", attention_dropout=0.4)
|
||||
```
|
||||
|
||||
Once you are satisfied with your model configuration, you can save it with [`PretrainedConfig.save_pretrained`]. Your configuration file is stored as a JSON file in the specified save directory.
|
||||
Once you are satisfied with your model configuration, you can save it with [`PreTrainedConfig.save_pretrained`]. Your configuration file is stored as a JSON file in the specified save directory.
|
||||
|
||||
```py
|
||||
>>> my_config.save_pretrained(save_directory="./your_model_save_path")
|
||||
```
|
||||
|
||||
設定ファイルを再利用するには、[`~PretrainedConfig.from_pretrained`]を使用してそれをロードします:
|
||||
設定ファイルを再利用するには、[`~PreTrainedConfig.from_pretrained`]を使用してそれをロードします:
|
||||
|
||||
```py
|
||||
>>> my_config = DistilBertConfig.from_pretrained("./your_model_save_path/config.json")
|
||||
|
@ -29,11 +29,11 @@ rendered properly in your Markdown viewer.
|
||||
この例では、ResNetクラスのいくつかの引数を取得し、調整したいかもしれないとします。異なる設定は、異なるタイプのResNetを提供します。その後、これらの引数を確認した後、それらの引数を単に格納します。
|
||||
|
||||
```python
|
||||
from transformers import PretrainedConfig
|
||||
from transformers import PreTrainedConfig
|
||||
from typing import List
|
||||
|
||||
|
||||
class ResnetConfig(PretrainedConfig):
|
||||
class ResnetConfig(PreTrainedConfig):
|
||||
model_type = "resnet"
|
||||
|
||||
def __init__(
|
||||
@ -67,12 +67,12 @@ class ResnetConfig(PretrainedConfig):
|
||||
```
|
||||
|
||||
重要なことを3つ覚えておくべきポイントは次のとおりです:
|
||||
- `PretrainedConfig` を継承する必要があります。
|
||||
- あなたの `PretrainedConfig` の `__init__` は任意の kwargs を受け入れる必要があります。
|
||||
- `PreTrainedConfig` を継承する必要があります。
|
||||
- あなたの `PreTrainedConfig` の `__init__` は任意の kwargs を受け入れる必要があります。
|
||||
- これらの `kwargs` は親クラスの `__init__` に渡す必要があります。
|
||||
|
||||
継承は、🤗 Transformers ライブラリのすべての機能を取得できるようにするためです。他の2つの制約は、
|
||||
`PretrainedConfig` が設定しているフィールド以外にも多くのフィールドを持っていることから来ています。
|
||||
`PreTrainedConfig` が設定しているフィールド以外にも多くのフィールドを持っていることから来ています。
|
||||
`from_pretrained` メソッドで設定を再ロードする場合、これらのフィールドはあなたの設定に受け入れられ、
|
||||
その後、親クラスに送信される必要があります。
|
||||
|
||||
@ -95,7 +95,7 @@ resnet50d_config.save_pretrained("custom-resnet")
|
||||
resnet50d_config = ResnetConfig.from_pretrained("custom-resnet")
|
||||
```
|
||||
|
||||
また、[`PretrainedConfig`] クラスの他のメソッドを使用することもできます。たとえば、[`~PretrainedConfig.push_to_hub`] を使用して、設定を直接 Hub にアップロードできます。
|
||||
また、[`PreTrainedConfig`] クラスの他のメソッドを使用することもできます。たとえば、[`~PreTrainedConfig.push_to_hub`] を使用して、設定を直接 Hub にアップロードできます。
|
||||
|
||||
## Writing a custom model
|
||||
|
||||
|
@ -16,7 +16,7 @@ rendered properly in your Markdown viewer.
|
||||
|
||||
# 構成
|
||||
|
||||
基本クラス [`PretrainedConfig`] は、設定をロード/保存するための一般的なメソッドを実装します。
|
||||
基本クラス [`PreTrainedConfig`] は、設定をロード/保存するための一般的なメソッドを実装します。
|
||||
ローカル ファイルまたはディレクトリから、またはライブラリ (ダウンロードされた) によって提供される事前トレーニング済みモデル構成から
|
||||
HuggingFace の AWS S3 リポジトリから)。
|
||||
|
||||
@ -24,8 +24,8 @@ HuggingFace の AWS S3 リポジトリから)。
|
||||
`hidden_size`、`num_attention_heads`、および `num_hidden_layers`。テキスト モデルはさらに以下を実装します。
|
||||
`vocab_size`。
|
||||
|
||||
## PretrainedConfig
|
||||
## PreTrainedConfig
|
||||
|
||||
[[autodoc]] PretrainedConfig
|
||||
[[autodoc]] PreTrainedConfig
|
||||
- push_to_hub
|
||||
- all
|
||||
|
@ -43,7 +43,7 @@ AutoModel.register(NewModelConfig, NewModel)
|
||||
|
||||
<Tip warning={true}>
|
||||
|
||||
あなたの`NewModelConfig`が[`~transformers.PretrainedConfig`]のサブクラスである場合、その`model_type`属性がコンフィグを登録するときに使用するキー(ここでは`"new-model"`)と同じに設定されていることを確認してください。
|
||||
あなたの`NewModelConfig`が[`~transformers.PreTrainedConfig`]のサブクラスである場合、その`model_type`属性がコンフィグを登録するときに使用するキー(ここでは`"new-model"`)と同じに設定されていることを確認してください。
|
||||
|
||||
同様に、あなたの`NewModel`が[`PreTrainedModel`]のサブクラスである場合、その`config_class`属性がモデルを登録する際に使用するクラス(ここでは`NewModelConfig`)と同じに設定されていることを確認してください。
|
||||
|
||||
|
@ -46,7 +46,7 @@ Hugging Face 팀은 항상 도움을 줄 준비가 되어 있으므로 혼자가
|
||||
|
||||
### 모델 개요 [[overview-of-models]]
|
||||
|
||||
모델을 성공적으로 추가하려면 모델과 해당 구성인 [`PreTrainedModel`] 및 [`PretrainedConfig`] 간의 상호작용을 이해하는 것이 중요합니다. 예를 들어, 🤗 Transformers에 추가하려는 모델을 `BrandNewBert`라고 부르겠습니다.
|
||||
모델을 성공적으로 추가하려면 모델과 해당 구성인 [`PreTrainedModel`] 및 [`PreTrainedConfig`] 간의 상호작용을 이해하는 것이 중요합니다. 예를 들어, 🤗 Transformers에 추가하려는 모델을 `BrandNewBert`라고 부르겠습니다.
|
||||
|
||||
다음을 살펴보겠습니다:
|
||||
|
||||
@ -59,7 +59,7 @@ model = BrandNewBertModel.from_pretrained("brandy/brand_new_bert")
|
||||
model.config # model has access to its config
|
||||
```
|
||||
|
||||
모델과 마찬가지로 구성은 [`PretrainedConfig`]에서 기본 직렬화 및 역직렬화 기능을 상속받습니다. 구성과 모델은 항상 *pytorch_model.bin* 파일과 *config.json* 파일로 각각 별도로 직렬화됩니다. [`~PreTrainedModel.save_pretrained`]를 호출하면 자동으로 [`~PretrainedConfig.save_pretrained`]도 호출되므로 모델과 구성이 모두 저장됩니다.
|
||||
모델과 마찬가지로 구성은 [`PreTrainedConfig`]에서 기본 직렬화 및 역직렬화 기능을 상속받습니다. 구성과 모델은 항상 *pytorch_model.bin* 파일과 *config.json* 파일로 각각 별도로 직렬화됩니다. [`~PreTrainedModel.save_pretrained`]를 호출하면 자동으로 [`~PreTrainedConfig.save_pretrained`]도 호출되므로 모델과 구성이 모두 저장됩니다.
|
||||
|
||||
|
||||
### 코드 스타일 [[code-style]]
|
||||
|
@ -36,11 +36,11 @@ rendered properly in your Markdown viewer.
|
||||
그런 다음 몇 가지 유효성을 확인한 후 해당 인수를 저장합니다.
|
||||
|
||||
```python
|
||||
from transformers import PretrainedConfig
|
||||
from transformers import PreTrainedConfig
|
||||
from typing import List
|
||||
|
||||
|
||||
class ResnetConfig(PretrainedConfig):
|
||||
class ResnetConfig(PreTrainedConfig):
|
||||
model_type = "resnet"
|
||||
|
||||
def __init__(
|
||||
@ -74,12 +74,12 @@ class ResnetConfig(PretrainedConfig):
|
||||
```
|
||||
|
||||
사용자 정의 `configuration`을 작성할 때 기억해야 할 세 가지 중요한 사항은 다음과 같습니다:
|
||||
- `PretrainedConfig`을 상속해야 합니다.
|
||||
- `PretrainedConfig`의 `__init__`은 모든 kwargs를 허용해야 하고,
|
||||
- `PreTrainedConfig`을 상속해야 합니다.
|
||||
- `PreTrainedConfig`의 `__init__`은 모든 kwargs를 허용해야 하고,
|
||||
- 이러한 `kwargs`는 상위 클래스 `__init__`에 전달되어야 합니다.
|
||||
|
||||
상속은 🤗 Transformers 라이브러리에서 모든 기능을 가져오는 것입니다.
|
||||
이러한 점으로부터 비롯되는 두 가지 제약 조건은 `PretrainedConfig`에 설정하는 것보다 더 많은 필드가 있습니다.
|
||||
이러한 점으로부터 비롯되는 두 가지 제약 조건은 `PreTrainedConfig`에 설정하는 것보다 더 많은 필드가 있습니다.
|
||||
`from_pretrained` 메서드로 구성을 다시 로드할 때 해당 필드는 구성에서 수락한 후 상위 클래스로 보내야 합니다.
|
||||
|
||||
모델을 auto 클래스에 등록하지 않는 한, `configuration`에서 `model_type`을 정의(여기서 `model_type="resnet"`)하는 것은 필수 사항이 아닙니다 (마지막 섹션 참조).
|
||||
@ -99,7 +99,7 @@ resnet50d_config.save_pretrained("custom-resnet")
|
||||
resnet50d_config = ResnetConfig.from_pretrained("custom-resnet")
|
||||
```
|
||||
|
||||
구성을 Hub에 직접 업로드하기 위해 [`PretrainedConfig`] 클래스의 [`~PretrainedConfig.push_to_hub`]와 같은 다른 메서드를 사용할 수 있습니다.
|
||||
구성을 Hub에 직접 업로드하기 위해 [`PreTrainedConfig`] 클래스의 [`~PreTrainedConfig.push_to_hub`]와 같은 다른 메서드를 사용할 수 있습니다.
|
||||
|
||||
|
||||
## 사용자 정의 모델 작성하기[[writing-a-custom-model]]
|
||||
|
@ -16,13 +16,13 @@ rendered properly in your Markdown viewer.
|
||||
|
||||
# 구성[[configuration]]
|
||||
|
||||
기본 클래스 [`PretrainedConfig`]는 로컬 파일이나 디렉토리, 또는 라이브러리에서 제공하는 사전 학습된 모델 구성(HuggingFace의 AWS S3 저장소에서 다운로드됨)으로부터 구성을 불러오거나 저장하는 공통 메서드를 구현합니다. 각 파생 구성 클래스는 모델별 특성을 구현합니다.
|
||||
기본 클래스 [`PreTrainedConfig`]는 로컬 파일이나 디렉토리, 또는 라이브러리에서 제공하는 사전 학습된 모델 구성(HuggingFace의 AWS S3 저장소에서 다운로드됨)으로부터 구성을 불러오거나 저장하는 공통 메서드를 구현합니다. 각 파생 구성 클래스는 모델별 특성을 구현합니다.
|
||||
|
||||
모든 구성 클래스에 존재하는 공통 속성은 다음과 같습니다: `hidden_size`, `num_attention_heads`, `num_hidden_layers`. 텍스트 모델은 추가로 `vocab_size`를 구현합니다.
|
||||
|
||||
|
||||
## PretrainedConfig[[transformers.PretrainedConfig]]
|
||||
## PreTrainedConfig[[transformers.PreTrainedConfig]]
|
||||
|
||||
[[autodoc]] PretrainedConfig
|
||||
[[autodoc]] PreTrainedConfig
|
||||
- push_to_hub
|
||||
- all
|
||||
|
@ -44,7 +44,7 @@ AutoModel.register(NewModelConfig, NewModel)
|
||||
|
||||
<Tip warning={true}>
|
||||
|
||||
만약 `NewModelConfig`가 [`~transformers.PretrainedConfig`]의 서브클래스라면, 해당 `model_type` 속성이 등록할 때 사용하는 키(여기서는 `"new-model"`)와 동일하게 설정되어 있는지 확인하세요.
|
||||
만약 `NewModelConfig`가 [`~transformers.PreTrainedConfig`]의 서브클래스라면, 해당 `model_type` 속성이 등록할 때 사용하는 키(여기서는 `"new-model"`)와 동일하게 설정되어 있는지 확인하세요.
|
||||
|
||||
마찬가지로, `NewModel`이 [`PreTrainedModel`]의 서브클래스라면, 해당 `config_class` 속성이 등록할 때 사용하는 클래스(여기서는 `NewModelConfig`)와 동일하게 설정되어 있는지 확인하세요.
|
||||
|
||||
|
@ -83,19 +83,19 @@ DistilBertConfig {
|
||||
}
|
||||
```
|
||||
|
||||
Atributos de um modelo pré-treinado podem ser modificados na função [`~PretrainedConfig.from_pretrained`]:
|
||||
Atributos de um modelo pré-treinado podem ser modificados na função [`~PreTrainedConfig.from_pretrained`]:
|
||||
|
||||
```py
|
||||
>>> my_config = DistilBertConfig.from_pretrained("distilbert/distilbert-base-uncased", activation="relu", attention_dropout=0.4)
|
||||
```
|
||||
|
||||
Uma vez que você está satisfeito com as configurações do seu modelo, você consegue salvar elas com [`~PretrainedConfig.save_pretrained`]. Seu arquivo de configurações está salvo como um arquivo JSON no diretório especificado:
|
||||
Uma vez que você está satisfeito com as configurações do seu modelo, você consegue salvar elas com [`~PreTrainedConfig.save_pretrained`]. Seu arquivo de configurações está salvo como um arquivo JSON no diretório especificado:
|
||||
|
||||
```py
|
||||
>>> my_config.save_pretrained(save_directory="./your_model_save_path")
|
||||
```
|
||||
|
||||
Para reusar o arquivo de configurações, carregue com [`~PretrainedConfig.from_pretrained`]:
|
||||
Para reusar o arquivo de configurações, carregue com [`~PreTrainedConfig.from_pretrained`]:
|
||||
|
||||
```py
|
||||
>>> my_config = DistilBertConfig.from_pretrained("./your_model_save_path/my_config.json")
|
||||
|
@ -37,11 +37,11 @@ configurações nos dará os diferentes tipos de ResNets que são possíveis. Em
|
||||
após verificar a validade de alguns deles.
|
||||
|
||||
```python
|
||||
from transformers import PretrainedConfig
|
||||
from transformers import PreTrainedConfig
|
||||
from typing import List
|
||||
|
||||
|
||||
class ResnetConfig(PretrainedConfig):
|
||||
class ResnetConfig(PreTrainedConfig):
|
||||
model_type = "resnet"
|
||||
|
||||
def __init__(
|
||||
@ -75,12 +75,12 @@ class ResnetConfig(PretrainedConfig):
|
||||
```
|
||||
|
||||
As três coisas importantes a serem lembradas ao escrever sua própria configuração são:
|
||||
- você tem que herdar de `PretrainedConfig`,
|
||||
- o `__init__` do seu `PretrainedConfig` deve aceitar quaisquer kwargs,
|
||||
- você tem que herdar de `PreTrainedConfig`,
|
||||
- o `__init__` do seu `PreTrainedConfig` deve aceitar quaisquer kwargs,
|
||||
- esses `kwargs` precisam ser passados para a superclasse `__init__`.
|
||||
|
||||
A herança é para garantir que você obtenha todas as funcionalidades da biblioteca 🤗 Transformers, enquanto as outras duas
|
||||
restrições vêm do fato de um `PretrainedConfig` ter mais campos do que os que você está configurando. Ao recarregar um
|
||||
restrições vêm do fato de um `PreTrainedConfig` ter mais campos do que os que você está configurando. Ao recarregar um
|
||||
config com o método `from_pretrained`, esses campos precisam ser aceitos pelo seu config e então enviados para a
|
||||
superclasse.
|
||||
|
||||
@ -102,7 +102,7 @@ método `from_pretrained`:
|
||||
resnet50d_config = ResnetConfig.from_pretrained("custom-resnet")
|
||||
```
|
||||
|
||||
Você também pode usar qualquer outro método da classe [`PretrainedConfig`], como [`~PretrainedConfig.push_to_hub`] para
|
||||
Você também pode usar qualquer outro método da classe [`PreTrainedConfig`], como [`~PreTrainedConfig.push_to_hub`] para
|
||||
carregar diretamente sua configuração para o Hub.
|
||||
|
||||
## Escrevendo um modelo customizado
|
||||
|
@ -84,19 +84,19 @@ DistilBertConfig {
|
||||
}
|
||||
```
|
||||
|
||||
预训练模型的属性可以在 [`~PretrainedConfig.from_pretrained`] 函数中进行修改:
|
||||
预训练模型的属性可以在 [`~PreTrainedConfig.from_pretrained`] 函数中进行修改:
|
||||
|
||||
```py
|
||||
>>> my_config = DistilBertConfig.from_pretrained("distilbert/distilbert-base-uncased", activation="relu", attention_dropout=0.4)
|
||||
```
|
||||
|
||||
当你对模型配置满意时,可以使用 [`~PretrainedConfig.save_pretrained`] 来保存配置。你的配置文件将以 JSON 文件的形式存储在指定的保存目录中:
|
||||
当你对模型配置满意时,可以使用 [`~PreTrainedConfig.save_pretrained`] 来保存配置。你的配置文件将以 JSON 文件的形式存储在指定的保存目录中:
|
||||
|
||||
```py
|
||||
>>> my_config.save_pretrained(save_directory="./your_model_save_path")
|
||||
```
|
||||
|
||||
要重用配置文件,请使用 [`~PretrainedConfig.from_pretrained`] 进行加载:
|
||||
要重用配置文件,请使用 [`~PreTrainedConfig.from_pretrained`] 进行加载:
|
||||
|
||||
```py
|
||||
>>> my_config = DistilBertConfig.from_pretrained("./your_model_save_path/config.json")
|
||||
|
@ -29,11 +29,11 @@ rendered properly in your Markdown viewer.
|
||||
我们将采用一些我们可能想要调整的 ResNet 类的参数举例。不同的配置将为我们提供不同类型可能的 ResNet 模型。在确认其中一些参数的有效性后,我们只需存储这些参数。
|
||||
|
||||
```python
|
||||
from transformers import PretrainedConfig
|
||||
from transformers import PreTrainedConfig
|
||||
from typing import List
|
||||
|
||||
|
||||
class ResnetConfig(PretrainedConfig):
|
||||
class ResnetConfig(PreTrainedConfig):
|
||||
model_type = "resnet"
|
||||
|
||||
def __init__(
|
||||
@ -67,11 +67,11 @@ class ResnetConfig(PretrainedConfig):
|
||||
```
|
||||
|
||||
编写自定义配置时需要记住的三个重要事项如下:
|
||||
- 必须继承自 `PretrainedConfig`,
|
||||
- `PretrainedConfig` 的 `__init__` 方法必须接受任何 kwargs,
|
||||
- 必须继承自 `PreTrainedConfig`,
|
||||
- `PreTrainedConfig` 的 `__init__` 方法必须接受任何 kwargs,
|
||||
- 这些 `kwargs` 需要传递给超类的 `__init__` 方法。
|
||||
|
||||
继承是为了确保你获得来自 🤗 Transformers 库的所有功能,而另外两个约束源于 `PretrainedConfig` 的字段比你设置的字段多。在使用 `from_pretrained` 方法重新加载配置时,这些字段需要被你的配置接受,然后传递给超类。
|
||||
继承是为了确保你获得来自 🤗 Transformers 库的所有功能,而另外两个约束源于 `PreTrainedConfig` 的字段比你设置的字段多。在使用 `from_pretrained` 方法重新加载配置时,这些字段需要被你的配置接受,然后传递给超类。
|
||||
|
||||
为你的配置定义 `model_type`(此处为 `model_type="resnet"`)不是必须的,除非你想使用自动类注册你的模型(请参阅最后一节)。
|
||||
|
||||
@ -88,7 +88,7 @@ resnet50d_config.save_pretrained("custom-resnet")
|
||||
resnet50d_config = ResnetConfig.from_pretrained("custom-resnet")
|
||||
```
|
||||
|
||||
你还可以使用 [`PretrainedConfig`] 类的任何其他方法,例如 [`~PretrainedConfig.push_to_hub`],直接将配置上传到 Hub。
|
||||
你还可以使用 [`PreTrainedConfig`] 类的任何其他方法,例如 [`~PreTrainedConfig.push_to_hub`],直接将配置上传到 Hub。
|
||||
|
||||
## 编写自定义模型
|
||||
|
||||
|
@ -16,13 +16,13 @@ rendered properly in your Markdown viewer.
|
||||
|
||||
# Configuration
|
||||
|
||||
基类[`PretrainedConfig`]实现了从本地文件或目录加载/保存配置的常见方法,或下载库提供的预训练模型配置(从HuggingFace的AWS S3库中下载)。
|
||||
基类[`PreTrainedConfig`]实现了从本地文件或目录加载/保存配置的常见方法,或下载库提供的预训练模型配置(从HuggingFace的AWS S3库中下载)。
|
||||
|
||||
每个派生的配置类都实现了特定于模型的属性。所有配置类中共同存在的属性有:`hidden_size`、`num_attention_heads` 和 `num_hidden_layers`。文本模型进一步添加了 `vocab_size`。
|
||||
|
||||
|
||||
## PretrainedConfig
|
||||
## PreTrainedConfig
|
||||
|
||||
[[autodoc]] PretrainedConfig
|
||||
[[autodoc]] PreTrainedConfig
|
||||
- push_to_hub
|
||||
- all
|
||||
|
@ -17,7 +17,7 @@ from transformers import (
|
||||
AutoModelForTokenClassification,
|
||||
AutoModelWithLMHead,
|
||||
AutoTokenizer,
|
||||
PretrainedConfig,
|
||||
PreTrainedConfig,
|
||||
PreTrainedTokenizer,
|
||||
is_torch_available,
|
||||
)
|
||||
@ -93,7 +93,7 @@ class BaseTransformer(pl.LightningModule):
|
||||
**config_kwargs,
|
||||
)
|
||||
else:
|
||||
self.config: PretrainedConfig = config
|
||||
self.config: PreTrainedConfig = config
|
||||
|
||||
extra_model_params = ("encoder_layerdrop", "decoder_layerdrop", "dropout", "attention_dropout")
|
||||
for p in extra_model_params:
|
||||
|
@ -5,19 +5,19 @@
|
||||
# modular_duplicated_method.py file directly. One of our CI enforces this.
|
||||
# 🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨
|
||||
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...modeling_rope_utils import rope_config_validation
|
||||
|
||||
|
||||
class DuplicatedMethodConfig(PretrainedConfig):
|
||||
class DuplicatedMethodConfig(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`DuplicatedMethodModel`]. It is used to instantiate an DuplicatedMethod
|
||||
model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
|
||||
defaults will yield a similar configuration to that of the DuplicatedMethod-7B.
|
||||
e.g. [meta-duplicated_method/DuplicatedMethod-2-7b-hf](https://huggingface.co/meta-duplicated_method/DuplicatedMethod-2-7b-hf)
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
|
||||
Args:
|
||||
|
@ -5,19 +5,19 @@
|
||||
# modular_my_new_model.py file directly. One of our CI enforces this.
|
||||
# 🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨
|
||||
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...modeling_rope_utils import rope_config_validation
|
||||
|
||||
|
||||
class MyNewModelConfig(PretrainedConfig):
|
||||
class MyNewModelConfig(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`MyNewModelModel`]. It is used to instantiate an MyNewModel
|
||||
model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
|
||||
defaults will yield a similar configuration to that of the MyNewModel-7B.
|
||||
e.g. [meta-my_new_model/MyNewModel-2-7b-hf](https://huggingface.co/meta-my_new_model/MyNewModel-2-7b-hf)
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
|
||||
Args:
|
||||
|
@ -5,18 +5,18 @@
|
||||
# modular_my_new_model2.py file directly. One of our CI enforces this.
|
||||
# 🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨
|
||||
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...modeling_rope_utils import rope_config_validation
|
||||
|
||||
|
||||
class MyNewModel2Config(PretrainedConfig):
|
||||
class MyNewModel2Config(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`GemmaModel`]. It is used to instantiate an Gemma
|
||||
model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
|
||||
defaults will yield a similar configuration to that of the Gemma-7B.
|
||||
e.g. [google/gemma-7b](https://huggingface.co/google/gemma-7b)
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
Args:
|
||||
vocab_size (`int`, *optional*, defaults to 256000):
|
||||
Vocabulary size of the Gemma model. Defines the number of different tokens that can be represented by the
|
||||
|
@ -6,17 +6,17 @@
|
||||
# 🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨
|
||||
# Example where we only want to overwrite the defaults of an init
|
||||
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
|
||||
|
||||
class NewModelConfig(PretrainedConfig):
|
||||
class NewModelConfig(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`NewModelModel`]. It is used to instantiate an NewModel
|
||||
model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
|
||||
defaults will yield a similar configuration to that of the NewModel-7B.
|
||||
e.g. [google/new_model-7b](https://huggingface.co/google/new_model-7b)
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
Args:
|
||||
vocab_size (`int`, *optional*, defaults to 256000):
|
||||
Vocabulary size of the NewModel model. Defines the number of different tokens that can be represented by the
|
||||
|
@ -9,8 +9,8 @@ class MyNewModelConfig(LlamaConfig):
|
||||
defaults will yield a similar configuration to that of the MyNewModel-7B.
|
||||
e.g. [meta-my_new_model/MyNewModel-2-7b-hf](https://huggingface.co/meta-my_new_model/MyNewModel-2-7b-hf)
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
|
||||
Args:
|
||||
|
@ -9,8 +9,8 @@ class MyNewModel2Config(LlamaConfig):
|
||||
model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
|
||||
defaults will yield a similar configuration to that of the Gemma-7B.
|
||||
e.g. [google/gemma-7b](https://huggingface.co/google/gemma-7b)
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
Args:
|
||||
vocab_size (`int`, *optional*, defaults to 256000):
|
||||
Vocabulary size of the Gemma model. Defines the number of different tokens that can be represented by the
|
||||
|
@ -51,7 +51,7 @@ from transformers import (
|
||||
DataCollatorWithPadding,
|
||||
EvalPrediction,
|
||||
HfArgumentParser,
|
||||
PretrainedConfig,
|
||||
PreTrainedConfig,
|
||||
Trainer,
|
||||
TrainingArguments,
|
||||
default_data_collator,
|
||||
@ -429,7 +429,7 @@ def main():
|
||||
# Some models have set the order of the labels to use, so let's make sure we do use it.
|
||||
label_to_id = None
|
||||
if (
|
||||
model.config.label2id != PretrainedConfig(num_labels=num_labels).label2id
|
||||
model.config.label2id != PreTrainedConfig(num_labels=num_labels).label2id
|
||||
and data_args.task_name is not None
|
||||
and not is_regression
|
||||
):
|
||||
|
@ -53,7 +53,7 @@ from transformers import (
|
||||
AutoModelForSequenceClassification,
|
||||
AutoTokenizer,
|
||||
DataCollatorWithPadding,
|
||||
PretrainedConfig,
|
||||
PreTrainedConfig,
|
||||
SchedulerType,
|
||||
default_data_collator,
|
||||
get_scheduler,
|
||||
@ -367,7 +367,7 @@ def main():
|
||||
# Some models have set the order of the labels to use, so let's make sure we do use it.
|
||||
label_to_id = None
|
||||
if (
|
||||
model.config.label2id != PretrainedConfig(num_labels=num_labels).label2id
|
||||
model.config.label2id != PreTrainedConfig(num_labels=num_labels).label2id
|
||||
and args.task_name is not None
|
||||
and not is_regression
|
||||
):
|
||||
|
@ -48,7 +48,7 @@ from transformers import (
|
||||
AutoTokenizer,
|
||||
DataCollatorForTokenClassification,
|
||||
HfArgumentParser,
|
||||
PretrainedConfig,
|
||||
PreTrainedConfig,
|
||||
PreTrainedTokenizerFast,
|
||||
Trainer,
|
||||
TrainingArguments,
|
||||
@ -413,7 +413,7 @@ def main():
|
||||
)
|
||||
|
||||
# Model has labels -> use them.
|
||||
if model.config.label2id != PretrainedConfig(num_labels=num_labels).label2id:
|
||||
if model.config.label2id != PreTrainedConfig(num_labels=num_labels).label2id:
|
||||
if sorted(model.config.label2id.keys()) == sorted(label_list):
|
||||
# Reorganize `label_list` to match the ordering of the model.
|
||||
if labels_are_int:
|
||||
|
@ -57,7 +57,7 @@ from transformers import (
|
||||
AutoModelForTokenClassification,
|
||||
AutoTokenizer,
|
||||
DataCollatorForTokenClassification,
|
||||
PretrainedConfig,
|
||||
PreTrainedConfig,
|
||||
SchedulerType,
|
||||
default_data_collator,
|
||||
get_scheduler,
|
||||
@ -454,7 +454,7 @@ def main():
|
||||
model.resize_token_embeddings(len(tokenizer))
|
||||
|
||||
# Model has labels -> use them.
|
||||
if model.config.label2id != PretrainedConfig(num_labels=num_labels).label2id:
|
||||
if model.config.label2id != PreTrainedConfig(num_labels=num_labels).label2id:
|
||||
if sorted(model.config.label2id.keys()) == sorted(label_list):
|
||||
# Reorganize `label_list` to match the ordering of the model.
|
||||
if labels_are_int:
|
||||
|
@ -59,7 +59,7 @@ logger = logging.get_logger(__name__) # pylint: disable=invalid-name
|
||||
_import_structure = {
|
||||
"audio_utils": [],
|
||||
"commands": [],
|
||||
"configuration_utils": ["PretrainedConfig"],
|
||||
"configuration_utils": ["PreTrainedConfig", "PretrainedConfig"],
|
||||
"convert_slow_tokenizers_checkpoints_to_fast": [],
|
||||
"data": [
|
||||
"DataProcessor",
|
||||
@ -491,6 +491,7 @@ if TYPE_CHECKING:
|
||||
from .cache_utils import StaticCache as StaticCache
|
||||
from .cache_utils import StaticLayer as StaticLayer
|
||||
from .cache_utils import StaticSlidingWindowLayer as StaticSlidingWindowLayer
|
||||
from .configuration_utils import PreTrainedConfig as PreTrainedConfig
|
||||
from .configuration_utils import PretrainedConfig as PretrainedConfig
|
||||
from .convert_slow_tokenizer import SLOW_TO_FAST_CONVERTERS as SLOW_TO_FAST_CONVERTERS
|
||||
from .convert_slow_tokenizer import convert_slow_tokenizer as convert_slow_tokenizer
|
||||
|
@ -4,7 +4,7 @@ from typing import Any, Optional
|
||||
|
||||
import torch
|
||||
|
||||
from .configuration_utils import PretrainedConfig
|
||||
from .configuration_utils import PreTrainedConfig
|
||||
from .utils import (
|
||||
is_hqq_available,
|
||||
is_quanto_greater,
|
||||
@ -923,7 +923,7 @@ class DynamicCache(Cache):
|
||||
`map(gather_map, zip(*caches))`, i.e. each item in the iterable contains the key and value states
|
||||
for a layer gathered across replicas by torch.distributed (shape=[global batch size, num_heads, seq_len, head_dim]).
|
||||
Note: it needs to be the 1st arg as well to work correctly
|
||||
config (`PretrainedConfig`, *optional*):
|
||||
config (`PreTrainedConfig`, *optional*):
|
||||
The config of the model for which this Cache will be used. If passed, it will be used to check for sliding
|
||||
or hybrid layer structure, greatly reducing the memory requirement of the cached tensors to
|
||||
`[batch_size, num_heads, min(seq_len, sliding_window), head_dim]`.
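As a rough usage sketch of what this docstring describes — constructing the cache with the model's config so layer types are inferred (the checkpoint is a placeholder; the `config` argument follows the signature shown in this diff):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, DynamicCache

model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained("gpt2")

cache = DynamicCache(config=model.config)  # lets the cache size sliding/hybrid layers correctly

inputs = tokenizer("The cache stores past key/value states", return_tensors="pt")
outputs = model.generate(**inputs, past_key_values=cache, max_new_tokens=5)
```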
@ -953,7 +953,7 @@ class DynamicCache(Cache):
|
||||
def __init__(
|
||||
self,
|
||||
ddp_cache_data: Optional[Iterable[tuple[torch.Tensor, torch.Tensor]]] = None,
|
||||
config: Optional[PretrainedConfig] = None,
|
||||
config: Optional[PreTrainedConfig] = None,
|
||||
offloading: bool = False,
|
||||
offload_only_non_sliding: bool = False,
|
||||
):
|
||||
@ -1036,7 +1036,7 @@ class StaticCache(Cache):
|
||||
See `Cache` for details on common methods that are implemented by all cache classes.
|
||||
|
||||
Args:
|
||||
config (`PretrainedConfig`):
|
||||
config (`PreTrainedConfig`):
|
||||
The config of the model for which this Cache will be used. It will be used to check for sliding
|
||||
or hybrid layer structure, and initialize each layer accordingly.
|
||||
max_cache_len (`int`):
|
||||
@ -1070,7 +1070,7 @@ class StaticCache(Cache):
|
||||
# Pass-in kwargs as well to avoid crashing for BC (it used more arguments before)
|
||||
def __init__(
|
||||
self,
|
||||
config: PretrainedConfig,
|
||||
config: PreTrainedConfig,
|
||||
max_cache_len: int,
|
||||
offloading: bool = False,
|
||||
offload_only_non_sliding: bool = True,
|
||||
@ -1124,7 +1124,7 @@ class QuantizedCache(Cache):
|
||||
Args:
|
||||
backend (`str`):
|
||||
The quantization backend to use. One of `("quanto", "hqq").
|
||||
config (`PretrainedConfig`):
|
||||
config (`PreTrainedConfig`):
|
||||
The config of the model for which this Cache will be used.
|
||||
nbits (`int`, *optional*, defaults to 4):
|
||||
The number of bits for quantization.
|
||||
@ -1141,7 +1141,7 @@ class QuantizedCache(Cache):
|
||||
def __init__(
|
||||
self,
|
||||
backend: str,
|
||||
config: PretrainedConfig,
|
||||
config: PreTrainedConfig,
|
||||
nbits: int = 4,
|
||||
axis_key: int = 0,
|
||||
axis_value: int = 0,
|
||||
@ -1400,7 +1400,7 @@ class OffloadedCache(DynamicCache):
|
||||
|
||||
|
||||
class OffloadedStaticCache(StaticCache):
|
||||
def __init__(self, config: PretrainedConfig, max_cache_len: int, *args, **kwargs):
|
||||
def __init__(self, config: PreTrainedConfig, max_cache_len: int, *args, **kwargs):
|
||||
logger.warning_once(
|
||||
"`OffloadedStaticCache` is deprecated and will be removed in version v4.59 "
|
||||
"Use `StaticCache(..., offloading=True)` instead"
|
||||
@ -1409,7 +1409,7 @@ class OffloadedStaticCache(StaticCache):
|
||||
|
||||
|
||||
class SlidingWindowCache(StaticCache):
|
||||
def __init__(self, config: PretrainedConfig, max_cache_len: int, *args, **kwargs):
|
||||
def __init__(self, config: PreTrainedConfig, max_cache_len: int, *args, **kwargs):
|
||||
logger.warning_once(
|
||||
"`SlidingWindowCache` is deprecated and will be removed in version v4.59 "
|
||||
"Use `StaticCache(...)` instead which will correctly infer the type of each layer."
|
||||
@ -1418,7 +1418,7 @@ class SlidingWindowCache(StaticCache):
|
||||
|
||||
|
||||
class HybridCache(StaticCache):
|
||||
def __init__(self, config: PretrainedConfig, max_cache_len: int, *args, **kwargs):
|
||||
def __init__(self, config: PreTrainedConfig, max_cache_len: int, *args, **kwargs):
|
||||
logger.warning_once(
|
||||
"`HybridCache` is deprecated and will be removed in version v4.59 "
|
||||
"Use `StaticCache(...)` instead which will correctly infer the type of each layer."
|
||||
@ -1427,7 +1427,7 @@ class HybridCache(StaticCache):
|
||||
|
||||
|
||||
class HybridChunkedCache(StaticCache):
|
||||
def __init__(self, config: PretrainedConfig, max_cache_len: int, *args, **kwargs):
|
||||
def __init__(self, config: PreTrainedConfig, max_cache_len: int, *args, **kwargs):
|
||||
logger.warning_once(
|
||||
"`HybridChunkedCache` is deprecated and will be removed in version v4.59 "
|
||||
"Use `StaticCache(...)` instead which will correctly infer the type of each layer."
|
||||
@ -1436,7 +1436,7 @@ class HybridChunkedCache(StaticCache):
|
||||
|
||||
|
||||
class OffloadedHybridCache(StaticCache):
|
||||
def __init__(self, config: PretrainedConfig, max_cache_len: int, *args, **kwargs):
|
||||
def __init__(self, config: PreTrainedConfig, max_cache_len: int, *args, **kwargs):
|
||||
logger.warning_once(
|
||||
"`OffloadedHybridCache` is deprecated and will be removed in version v4.59 "
|
||||
"Use `StaticCache(..., offload=True)` instead which will correctly infer the type of each layer."
|
||||
@ -1447,7 +1447,7 @@ class OffloadedHybridCache(StaticCache):
|
||||
class QuantoQuantizedCache(QuantizedCache):
|
||||
def __init__(
|
||||
self,
|
||||
config: PretrainedConfig,
|
||||
config: PreTrainedConfig,
|
||||
nbits: int = 4,
|
||||
axis_key: int = 0,
|
||||
axis_value: int = 0,
|
||||
@ -1464,7 +1464,7 @@ class QuantoQuantizedCache(QuantizedCache):
|
||||
class HQQQuantizedCache(QuantizedCache):
|
||||
def __init__(
|
||||
self,
|
||||
config: PretrainedConfig,
|
||||
config: PreTrainedConfig,
|
||||
nbits: int = 4,
|
||||
axis_key: int = 0,
|
||||
axis_value: int = 0,
|
||||
|
@ -46,11 +46,11 @@ if TYPE_CHECKING:
|
||||
logger = logging.get_logger(__name__)
|
||||
|
||||
|
||||
# type hinting: specifying the type of config class that inherits from PretrainedConfig
|
||||
SpecificPretrainedConfigType = TypeVar("SpecificPretrainedConfigType", bound="PretrainedConfig")
|
||||
# type hinting: specifying the type of config class that inherits from PreTrainedConfig
|
||||
SpecificPreTrainedConfigType = TypeVar("SpecificPreTrainedConfigType", bound="PreTrainedConfig")
|
||||
|
||||
|
||||
class PretrainedConfig(PushToHubMixin):
|
||||
class PreTrainedConfig(PushToHubMixin):
|
||||
# no-format
|
||||
r"""
|
||||
Base class for all configuration classes. Handles a few parameters common to all models' configurations as well as
|
||||
@ -70,7 +70,7 @@ class PretrainedConfig(PushToHubMixin):
|
||||
- **has_no_defaults_at_init** (`bool`) -- Whether the config class can be initialized without providing input arguments.
|
||||
Some configurations requires inputs to be defined at init and have no default values, usually these are composite configs,
|
||||
(but not necessarily) such as [`~transformers.EncoderDecoderConfig`] or [`~RagConfig`]. They have to be initialized from
|
||||
two or more configs of type [`~transformers.PretrainedConfig`].
|
||||
two or more configs of type [`~transformers.PreTrainedConfig`].
|
||||
- **keys_to_ignore_at_inference** (`list[str]`) -- A list of keys to ignore by default when looking at dictionary
|
||||
outputs of the model during inference.
|
||||
- **attribute_map** (`dict[str, str]`) -- A dict that maps model specific attribute names to the standardized
|
||||
@ -186,7 +186,7 @@ class PretrainedConfig(PushToHubMixin):
|
||||
|
||||
model_type: str = ""
|
||||
base_config_key: str = ""
|
||||
sub_configs: dict[str, type["PretrainedConfig"]] = {}
|
||||
sub_configs: dict[str, type["PreTrainedConfig"]] = {}
|
||||
has_no_defaults_at_init: bool = False
|
||||
attribute_map: dict[str, str] = {}
|
||||
base_model_tp_plan: Optional[dict[str, Any]] = None
|
||||
@ -432,7 +432,7 @@ class PretrainedConfig(PushToHubMixin):
|
||||
def save_pretrained(self, save_directory: Union[str, os.PathLike], push_to_hub: bool = False, **kwargs):
|
||||
"""
|
||||
Save a configuration object to the directory `save_directory`, so that it can be re-loaded using the
|
||||
[`~PretrainedConfig.from_pretrained`] class method.
|
||||
[`~PreTrainedConfig.from_pretrained`] class method.
|
||||
|
||||
Args:
|
||||
save_directory (`str` or `os.PathLike`):
|
||||
@ -522,7 +522,7 @@ class PretrainedConfig(PushToHubMixin):
|
||||
|
||||
@classmethod
|
||||
def from_pretrained(
|
||||
cls: type[SpecificPretrainedConfigType],
|
||||
cls: type[SpecificPreTrainedConfigType],
|
||||
pretrained_model_name_or_path: Union[str, os.PathLike],
|
||||
cache_dir: Optional[Union[str, os.PathLike]] = None,
|
||||
force_download: bool = False,
|
||||
@ -530,9 +530,9 @@ class PretrainedConfig(PushToHubMixin):
|
||||
token: Optional[Union[str, bool]] = None,
|
||||
revision: str = "main",
|
||||
**kwargs,
|
||||
) -> SpecificPretrainedConfigType:
|
||||
) -> SpecificPreTrainedConfigType:
|
||||
r"""
|
||||
Instantiate a [`PretrainedConfig`] (or a derived class) from a pretrained model configuration.
|
||||
Instantiate a [`PreTrainedConfig`] (or a derived class) from a pretrained model configuration.
|
||||
|
||||
Args:
|
||||
pretrained_model_name_or_path (`str` or `os.PathLike`):
|
||||
@ -541,7 +541,7 @@ class PretrainedConfig(PushToHubMixin):
|
||||
- a string, the *model id* of a pretrained model configuration hosted inside a model repo on
|
||||
huggingface.co.
|
||||
- a path to a *directory* containing a configuration file saved using the
|
||||
[`~PretrainedConfig.save_pretrained`] method, e.g., `./my_model_directory/`.
|
||||
[`~PreTrainedConfig.save_pretrained`] method, e.g., `./my_model_directory/`.
|
||||
- a path or url to a saved configuration JSON *file*, e.g., `./my_model_directory/configuration.json`.
|
||||
cache_dir (`str` or `os.PathLike`, *optional*):
|
||||
Path to a directory in which a downloaded pretrained model configuration should be cached if the
|
||||
@ -581,12 +581,12 @@ class PretrainedConfig(PushToHubMixin):
|
||||
by the `return_unused_kwargs` keyword parameter.
|
||||
|
||||
Returns:
|
||||
[`PretrainedConfig`]: The configuration object instantiated from this pretrained model.
|
||||
[`PreTrainedConfig`]: The configuration object instantiated from this pretrained model.
|
||||
|
||||
Examples:
|
||||
|
||||
```python
|
||||
# We can't instantiate directly the base class *PretrainedConfig* so let's show the examples on a
|
||||
# We can't instantiate directly the base class *PreTrainedConfig* so let's show the examples on a
|
||||
# derived class: BertConfig
|
||||
config = BertConfig.from_pretrained(
|
||||
"google-bert/bert-base-uncased"
|
||||
@ -636,7 +636,7 @@ class PretrainedConfig(PushToHubMixin):
|
||||
) -> tuple[dict[str, Any], dict[str, Any]]:
|
||||
"""
|
||||
From a `pretrained_model_name_or_path`, resolve to a dictionary of parameters, to be used for instantiating a
|
||||
[`PretrainedConfig`] using `from_dict`.
|
||||
[`PreTrainedConfig`] using `from_dict`.
|
||||
|
||||
Parameters:
|
||||
pretrained_model_name_or_path (`str` or `os.PathLike`):
|
||||
@ -761,20 +761,20 @@ class PretrainedConfig(PushToHubMixin):
|
||||
|
||||
@classmethod
|
||||
def from_dict(
|
||||
cls: type[SpecificPretrainedConfigType], config_dict: dict[str, Any], **kwargs
|
||||
) -> SpecificPretrainedConfigType:
|
||||
cls: type[SpecificPreTrainedConfigType], config_dict: dict[str, Any], **kwargs
|
||||
) -> SpecificPreTrainedConfigType:
|
||||
"""
|
||||
Instantiates a [`PretrainedConfig`] from a Python dictionary of parameters.
|
||||
Instantiates a [`PreTrainedConfig`] from a Python dictionary of parameters.
|
||||
|
||||
Args:
|
||||
config_dict (`dict[str, Any]`):
|
||||
Dictionary that will be used to instantiate the configuration object. Such a dictionary can be
|
||||
retrieved from a pretrained checkpoint by leveraging the [`~PretrainedConfig.get_config_dict`] method.
|
||||
retrieved from a pretrained checkpoint by leveraging the [`~PreTrainedConfig.get_config_dict`] method.
|
||||
kwargs (`dict[str, Any]`):
|
||||
Additional parameters from which to initialize the configuration object.
|
||||
|
||||
Returns:
|
||||
[`PretrainedConfig`]: The configuration object instantiated from those parameters.
|
||||
[`PreTrainedConfig`]: The configuration object instantiated from those parameters.
|
||||
"""
|
||||
return_unused_kwargs = kwargs.pop("return_unused_kwargs", False)
|
||||
# Those arguments may be passed along for our internal telemetry.
|
||||
@ -815,7 +815,7 @@ class PretrainedConfig(PushToHubMixin):
|
||||
current_attr = getattr(config, key)
|
||||
# To authorize passing a custom subconfig as kwarg in models that have nested configs.
|
||||
# We need to update only custom kwarg values instead and keep other attributes in subconfig.
|
||||
if isinstance(current_attr, PretrainedConfig) and isinstance(value, dict):
|
||||
if isinstance(current_attr, PreTrainedConfig) and isinstance(value, dict):
|
||||
current_attr_updated = current_attr.to_dict()
|
||||
current_attr_updated.update(value)
|
||||
value = current_attr.__class__(**current_attr_updated)
|
||||
@ -833,17 +833,17 @@ class PretrainedConfig(PushToHubMixin):
|
||||
|
||||
@classmethod
|
||||
def from_json_file(
|
||||
cls: type[SpecificPretrainedConfigType], json_file: Union[str, os.PathLike]
|
||||
) -> SpecificPretrainedConfigType:
|
||||
cls: type[SpecificPreTrainedConfigType], json_file: Union[str, os.PathLike]
|
||||
) -> SpecificPreTrainedConfigType:
|
||||
"""
|
||||
Instantiates a [`PretrainedConfig`] from the path to a JSON file of parameters.
|
||||
Instantiates a [`PreTrainedConfig`] from the path to a JSON file of parameters.
|
||||
|
||||
Args:
|
||||
json_file (`str` or `os.PathLike`):
|
||||
Path to the JSON file containing the parameters.
|
||||
|
||||
Returns:
|
||||
[`PretrainedConfig`]: The configuration object instantiated from that JSON file.
|
||||
[`PreTrainedConfig`]: The configuration object instantiated from that JSON file.
|
||||
|
||||
"""
|
||||
config_dict = cls._dict_from_json_file(json_file)
|
||||
@ -856,7 +856,7 @@ class PretrainedConfig(PushToHubMixin):
|
||||
return json.loads(text)
|
||||
|
||||
def __eq__(self, other):
|
||||
return isinstance(other, PretrainedConfig) and (self.__dict__ == other.__dict__)
|
||||
return isinstance(other, PreTrainedConfig) and (self.__dict__ == other.__dict__)
|
||||
|
||||
def __repr__(self):
|
||||
return f"{self.__class__.__name__} {self.to_json_string()}"
|
||||
@ -876,7 +876,7 @@ class PretrainedConfig(PushToHubMixin):
|
||||
config_dict = self.to_dict()
|
||||
|
||||
# Get the default config dict (from a fresh PreTrainedConfig instance)
|
||||
default_config_dict = PretrainedConfig().to_dict()
|
||||
default_config_dict = PreTrainedConfig().to_dict()
|
||||
|
||||
# get class specific config dict
|
||||
class_config_dict = self.__class__().to_dict() if not self.has_no_defaults_at_init else {}
|
||||
@ -887,7 +887,7 @@ class PretrainedConfig(PushToHubMixin):
|
||||
# except always keep the 'config' attribute.
|
||||
for key, value in config_dict.items():
|
||||
if (
|
||||
isinstance(getattr(self, key, None), PretrainedConfig)
|
||||
isinstance(getattr(self, key, None), PreTrainedConfig)
|
||||
and key in class_config_dict
|
||||
and isinstance(class_config_dict[key], dict)
|
||||
or key in self.sub_configs
|
||||
@ -940,7 +940,7 @@ class PretrainedConfig(PushToHubMixin):
|
||||
|
||||
for key, value in output.items():
|
||||
# Deal with nested configs like CLIP
|
||||
if isinstance(value, PretrainedConfig):
|
||||
if isinstance(value, PreTrainedConfig):
|
||||
value = value.to_dict()
|
||||
del value["transformers_version"]
|
||||
|
||||
@ -964,7 +964,7 @@ class PretrainedConfig(PushToHubMixin):
|
||||
|
||||
Args:
|
||||
use_diff (`bool`, *optional*, defaults to `True`):
|
||||
If set to `True`, only the difference between the config instance and the default `PretrainedConfig()`
|
||||
If set to `True`, only the difference between the config instance and the default `PreTrainedConfig()`
|
||||
is serialized to JSON string.
|
||||
|
||||
Returns:
|
||||
@ -984,7 +984,7 @@ class PretrainedConfig(PushToHubMixin):
|
||||
json_file_path (`str` or `os.PathLike`):
|
||||
Path to the JSON file in which this configuration instance's parameters will be saved.
|
||||
use_diff (`bool`, *optional*, defaults to `True`):
|
||||
If set to `True`, only the difference between the config instance and the default `PretrainedConfig()`
|
||||
If set to `True`, only the difference between the config instance and the default `PreTrainedConfig()`
|
||||
is serialized to JSON file.
|
||||
"""
|
||||
with open(json_file_path, "w", encoding="utf-8") as writer:
|
||||
@ -1137,7 +1137,7 @@ class PretrainedConfig(PushToHubMixin):
|
||||
|
||||
def _get_non_default_generation_parameters(self) -> dict[str, Any]:
|
||||
"""
|
||||
Gets the non-default generation parameters on the PretrainedConfig instance
|
||||
Gets the non-default generation parameters on the PreTrainedConfig instance
|
||||
"""
|
||||
non_default_generation_parameters = {}
|
||||
decoder_attribute_name = None
|
||||
@ -1179,7 +1179,7 @@ class PretrainedConfig(PushToHubMixin):
|
||||
|
||||
return non_default_generation_parameters
|
||||
|
||||
def get_text_config(self, decoder=None, encoder=None) -> "PretrainedConfig":
|
||||
def get_text_config(self, decoder=None, encoder=None) -> "PreTrainedConfig":
|
||||
"""
|
||||
Returns the text config related to the text input (encoder) or text output (decoder) of the model. The
|
||||
`decoder` and `encoder` input arguments can be used to specify which end of the model we are interested in,
|
||||
@ -1335,7 +1335,7 @@ def recursive_diff_dict(dict_a, dict_b, config_obj=None):
|
||||
default = config_obj.__class__().to_dict() if config_obj is not None else {}
|
||||
for key, value in dict_a.items():
|
||||
obj_value = getattr(config_obj, str(key), None)
|
||||
if isinstance(obj_value, PretrainedConfig) and key in dict_b and isinstance(dict_b[key], dict):
|
||||
if isinstance(obj_value, PreTrainedConfig) and key in dict_b and isinstance(dict_b[key], dict):
|
||||
diff_value = recursive_diff_dict(value, dict_b[key], config_obj=obj_value)
|
||||
diff[key] = diff_value
|
||||
elif key not in dict_b or (value != default[key]):
|
||||
@ -1343,13 +1343,17 @@ def recursive_diff_dict(dict_a, dict_b, config_obj=None):
|
||||
return diff
|
||||
|
||||
|
||||
PretrainedConfig.push_to_hub = copy_func(PretrainedConfig.push_to_hub)
|
||||
if PretrainedConfig.push_to_hub.__doc__ is not None:
|
||||
PretrainedConfig.push_to_hub.__doc__ = PretrainedConfig.push_to_hub.__doc__.format(
|
||||
PreTrainedConfig.push_to_hub = copy_func(PreTrainedConfig.push_to_hub)
|
||||
if PreTrainedConfig.push_to_hub.__doc__ is not None:
|
||||
PreTrainedConfig.push_to_hub.__doc__ = PreTrainedConfig.push_to_hub.__doc__.format(
|
||||
object="config", object_class="AutoConfig", object_files="configuration file"
|
||||
)
|
||||
|
||||
|
||||
# The alias is only here for BC - we did not have the correct CamelCasing before
|
||||
PretrainedConfig = PreTrainedConfig
|
||||
|
||||
|
||||
ALLOWED_LAYER_TYPES = (
|
||||
"full_attention",
|
||||
"sliding_attention",
|
||||
|
@ -613,7 +613,7 @@ def custom_object_save(obj: Any, folder: Union[str, os.PathLike], config: Option
|
||||
Args:
|
||||
obj (`Any`): The object for which to save the module files.
|
||||
folder (`str` or `os.PathLike`): The folder where to save.
|
||||
config (`PretrainedConfig` or dictionary, `optional`):
|
||||
config (`PreTrainedConfig` or dictionary, `optional`):
|
||||
A config in which to register the auto_map corresponding to this custom object.
|
||||
|
||||
Returns:
|
||||
|
@ -23,7 +23,7 @@ from dataclasses import dataclass, is_dataclass
|
||||
from typing import TYPE_CHECKING, Any, Callable, Optional, Union
|
||||
|
||||
from .. import __version__
|
||||
from ..configuration_utils import PretrainedConfig
|
||||
from ..configuration_utils import PreTrainedConfig
|
||||
from ..utils import (
|
||||
GENERATION_CONFIG_NAME,
|
||||
ExplicitEnum,
|
||||
@ -1101,13 +1101,13 @@ class GenerationConfig(PushToHubMixin):
|
||||
writer.write(self.to_json_string(use_diff=use_diff))
|
||||
|
||||
@classmethod
|
||||
def from_model_config(cls, model_config: PretrainedConfig) -> "GenerationConfig":
|
||||
def from_model_config(cls, model_config: PreTrainedConfig) -> "GenerationConfig":
|
||||
"""
|
||||
Instantiates a [`GenerationConfig`] from a [`PretrainedConfig`]. This function is useful to convert legacy
|
||||
[`PretrainedConfig`] objects, which may contain generation parameters, into a stand-alone [`GenerationConfig`].
|
||||
Instantiates a [`GenerationConfig`] from a [`PreTrainedConfig`]. This function is useful to convert legacy
|
||||
[`PreTrainedConfig`] objects, which may contain generation parameters, into a stand-alone [`GenerationConfig`].
|
||||
|
||||
Args:
|
||||
model_config (`PretrainedConfig`):
|
||||
model_config (`PreTrainedConfig`):
|
||||
The model config that will be used to instantiate the generation config.
|
||||
|
||||
Returns:
|
||||
|
@ -18,14 +18,14 @@ from typing import Optional, Union
|
||||
|
||||
import torch
|
||||
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...generation.configuration_utils import GenerationConfig
|
||||
from ...utils.metrics import attach_tracer, traced
|
||||
from .cache_manager import CacheAllocator, FullAttentionCacheAllocator, SlidingAttentionCacheAllocator
|
||||
from .requests import get_device_and_memory_breakdown, logger
|
||||
|
||||
|
||||
def group_layers_by_attn_type(config: PretrainedConfig) -> tuple[list[list[int]], list[str]]:
|
||||
def group_layers_by_attn_type(config: PreTrainedConfig) -> tuple[list[list[int]], list[str]]:
|
||||
"""
|
||||
Group layers depending on the attention mix, according to VLLM's hybrid allocator rules:
|
||||
- Layers in each group need to have the same type of attention
|
||||
@ -119,7 +119,7 @@ class PagedAttentionCache:
|
||||
# TODO: this init is quite long, maybe a refactor is in order
|
||||
def __init__(
|
||||
self,
|
||||
config: PretrainedConfig,
|
||||
config: PreTrainedConfig,
|
||||
generation_config: GenerationConfig,
|
||||
device: torch.device,
|
||||
dtype: torch.dtype = torch.float16,
|
||||
|
@ -25,7 +25,7 @@ import torch
|
||||
from torch import nn
|
||||
from tqdm import tqdm
|
||||
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...generation.configuration_utils import GenerationConfig
|
||||
from ...utils.logging import logging
|
||||
from ...utils.metrics import ContinuousBatchProcessorMetrics, attach_tracer, traced
|
||||
@ -140,7 +140,7 @@ class ContinuousBatchProcessor:
|
||||
def __init__(
|
||||
self,
|
||||
cache: PagedAttentionCache,
|
||||
config: PretrainedConfig,
|
||||
config: PreTrainedConfig,
|
||||
generation_config: GenerationConfig,
|
||||
input_queue: queue.Queue,
|
||||
output_queue: queue.Queue,
|
||||
|
@ -25,7 +25,7 @@ from torch.nn import BCELoss
|
||||
|
||||
from ..modeling_utils import PreTrainedModel
|
||||
from ..utils import ModelOutput, logging
|
||||
from .configuration_utils import PretrainedConfig, WatermarkingConfig
|
||||
from .configuration_utils import PreTrainedConfig, WatermarkingConfig
|
||||
from .logits_process import SynthIDTextWatermarkLogitsProcessor, WatermarkLogitsProcessor
|
||||
|
||||
|
||||
@ -75,7 +75,7 @@ class WatermarkDetector:
|
||||
See [the paper](https://huggingface.co/papers/2306.04634) for more information.
|
||||
|
||||
Args:
|
||||
model_config (`PretrainedConfig`):
|
||||
model_config (`PreTrainedConfig`):
|
||||
The model config that will be used to get model specific arguments used when generating.
|
||||
device (`str`):
|
||||
The device which was used during watermarked text generation.
|
||||
@ -119,7 +119,7 @@ class WatermarkDetector:
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
model_config: PretrainedConfig,
|
||||
model_config: PreTrainedConfig,
|
||||
device: str,
|
||||
watermarking_config: Union[WatermarkingConfig, dict],
|
||||
ignore_repeated_ngrams: bool = False,
|
||||
@ -237,13 +237,13 @@ class WatermarkDetector:
|
||||
return prediction
|
||||
|
||||
|
||||
class BayesianDetectorConfig(PretrainedConfig):
|
||||
class BayesianDetectorConfig(PreTrainedConfig):
|
||||
"""
|
||||
This is the configuration class to store the configuration of a [`BayesianDetectorModel`]. It is used to
|
||||
instantiate a Bayesian Detector model according to the specified arguments.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
Args:
|
||||
watermarking_depth (`int`, *optional*):
|
||||
|
@ -19,7 +19,7 @@ import torch
|
||||
import torch.nn.functional as F
|
||||
|
||||
from .cache_utils import Cache
|
||||
from .configuration_utils import PretrainedConfig
|
||||
from .configuration_utils import PreTrainedConfig
|
||||
from .utils import is_torch_xpu_available, logging
|
||||
from .utils.generic import GeneralInterface
|
||||
from .utils.import_utils import is_torch_flex_attn_available, is_torch_greater_or_equal, is_torchdynamo_compiling
|
||||
@ -662,7 +662,7 @@ def find_packed_sequence_indices(position_ids: torch.Tensor) -> torch.Tensor:
|
||||
|
||||
|
||||
def _preprocess_mask_arguments(
|
||||
config: PretrainedConfig,
|
||||
config: PreTrainedConfig,
|
||||
input_embeds: torch.Tensor,
|
||||
attention_mask: Optional[Union[torch.Tensor, BlockMask]],
|
||||
cache_position: torch.Tensor,
|
||||
@ -675,7 +675,7 @@ def _preprocess_mask_arguments(
|
||||
key-value length and offsets, and if we should early exit or not.
|
||||
|
||||
Args:
|
||||
config (`PretrainedConfig`):
|
||||
config (`PreTrainedConfig`):
|
||||
The model config.
|
||||
input_embeds (`torch.Tensor`):
|
||||
The input embeddings of shape (batch_size, query_length, hidden_dim). This is used only to infer the
|
||||
@ -743,7 +743,7 @@ def _preprocess_mask_arguments(
|
||||
|
||||
|
||||
def create_causal_mask(
|
||||
config: PretrainedConfig,
|
||||
config: PreTrainedConfig,
|
||||
input_embeds: torch.Tensor,
|
||||
attention_mask: Optional[torch.Tensor],
|
||||
cache_position: torch.Tensor,
|
||||
@ -758,7 +758,7 @@ def create_causal_mask(
|
||||
to what is needed in the `modeling_xxx.py` files).
|
||||
|
||||
Args:
|
||||
config (`PretrainedConfig`):
|
||||
config (`PreTrainedConfig`):
|
||||
The model config.
|
||||
input_embeds (`torch.Tensor`):
|
||||
The input embeddings of shape (batch_size, query_length, hidden_dim). This is used only to infer the
|
||||
@ -837,7 +837,7 @@ def create_causal_mask(
|
||||
|
||||
|
||||
def create_sliding_window_causal_mask(
|
||||
config: PretrainedConfig,
|
||||
config: PreTrainedConfig,
|
||||
input_embeds: torch.Tensor,
|
||||
attention_mask: Optional[torch.Tensor],
|
||||
cache_position: torch.Tensor,
|
||||
@ -853,7 +853,7 @@ def create_sliding_window_causal_mask(
|
||||
`modeling_xxx.py` files).
|
||||
|
||||
Args:
|
||||
config (`PretrainedConfig`):
|
||||
config (`PreTrainedConfig`):
|
||||
The model config.
|
||||
input_embeds (`torch.Tensor`):
|
||||
The input embeddings of shape (batch_size, query_length, hidden_dim). This is used only to infer the
|
||||
@ -934,7 +934,7 @@ def create_sliding_window_causal_mask(
|
||||
|
||||
|
||||
def create_chunked_causal_mask(
|
||||
config: PretrainedConfig,
|
||||
config: PreTrainedConfig,
|
||||
input_embeds: torch.Tensor,
|
||||
attention_mask: Optional[torch.Tensor],
|
||||
cache_position: torch.Tensor,
|
||||
@ -950,7 +950,7 @@ def create_chunked_causal_mask(
|
||||
`modeling_xxx.py` files).
|
||||
|
||||
Args:
|
||||
config (`PretrainedConfig`):
|
||||
config (`PreTrainedConfig`):
|
||||
The model config.
|
||||
input_embeds (`torch.Tensor`):
|
||||
The input embeddings of shape (batch_size, query_length, hidden_dim). This is used only to infer the
|
||||
@ -1063,7 +1063,7 @@ LAYER_PATTERN_TO_MASK_FUNCTION_MAPPING = {
|
||||
|
||||
|
||||
def create_masks_for_generate(
|
||||
config: PretrainedConfig,
|
||||
config: PreTrainedConfig,
|
||||
input_embeds: torch.Tensor,
|
||||
attention_mask: Optional[torch.Tensor],
|
||||
cache_position: torch.Tensor,
|
||||
@ -1078,7 +1078,7 @@ def create_masks_for_generate(
|
||||
in order to easily create the masks in advance, when we compile the forwards with Static caches.
|
||||
|
||||
Args:
|
||||
config (`PretrainedConfig`):
|
||||
config (`PreTrainedConfig`):
|
||||
The model config.
|
||||
input_embeds (`torch.Tensor`):
|
||||
The input embeddings of shape (batch_size, query_length, hidden_dim). This is used only to infer the
|
||||
|
@ -16,7 +16,7 @@ import math
|
||||
from functools import wraps
|
||||
from typing import Optional
|
||||
|
||||
from .configuration_utils import PretrainedConfig
|
||||
from .configuration_utils import PreTrainedConfig
|
||||
from .utils import is_torch_available, logging
|
||||
|
||||
|
||||
@ -90,14 +90,14 @@ def dynamic_rope_update(rope_forward):
|
||||
|
||||
|
||||
def _compute_default_rope_parameters(
|
||||
config: Optional[PretrainedConfig] = None,
|
||||
config: Optional[PreTrainedConfig] = None,
|
||||
device: Optional["torch.device"] = None,
|
||||
seq_len: Optional[int] = None,
|
||||
) -> tuple["torch.Tensor", float]:
|
||||
"""
|
||||
Computes the inverse frequencies according to the original RoPE implementation
|
||||
Args:
|
||||
config ([`~transformers.PretrainedConfig`]):
|
||||
config ([`~transformers.PreTrainedConfig`]):
|
||||
The model configuration. This function assumes that the config will provide at least the following
|
||||
properties:
|
||||
|
||||
@ -133,14 +133,14 @@ def _compute_default_rope_parameters(
|
||||
|
||||
|
||||
def _compute_linear_scaling_rope_parameters(
|
||||
config: Optional[PretrainedConfig] = None,
|
||||
config: Optional[PreTrainedConfig] = None,
|
||||
device: Optional["torch.device"] = None,
|
||||
seq_len: Optional[int] = None,
|
||||
) -> tuple["torch.Tensor", float]:
|
||||
"""
|
||||
Computes the inverse frequencies with linear scaling. Credits to the Reddit user /u/kaiokendev
|
||||
Args:
|
||||
config ([`~transformers.PretrainedConfig`]):
|
||||
config ([`~transformers.PreTrainedConfig`]):
|
||||
The model configuration. This function assumes that the config will provide at least the following
|
||||
properties:
|
||||
|
||||
@ -176,7 +176,7 @@ def _compute_linear_scaling_rope_parameters(
|
||||
|
||||
|
||||
def _compute_dynamic_ntk_parameters(
|
||||
config: Optional[PretrainedConfig] = None,
|
||||
config: Optional[PreTrainedConfig] = None,
|
||||
device: Optional["torch.device"] = None,
|
||||
seq_len: Optional[int] = None,
|
||||
) -> tuple["torch.Tensor", float]:
|
||||
@ -184,7 +184,7 @@ def _compute_dynamic_ntk_parameters(
|
||||
Computes the inverse frequencies with NTK scaling. Credits to the Reddit users /u/bloc97 and /u/emozilla
|
||||
|
||||
Args:
|
||||
config ([`~transformers.PretrainedConfig`]):
|
||||
config ([`~transformers.PreTrainedConfig`]):
|
||||
The model configuration. This function assumes that the config will provide at least the following
|
||||
properties:
|
||||
|
||||
@ -244,14 +244,14 @@ def _compute_dynamic_ntk_parameters(
|
||||
|
||||
|
||||
def _compute_yarn_parameters(
|
||||
config: PretrainedConfig, device: "torch.device", seq_len: Optional[int] = None
|
||||
config: PreTrainedConfig, device: "torch.device", seq_len: Optional[int] = None
|
||||
) -> tuple["torch.Tensor", float]:
|
||||
"""
|
||||
Computes the inverse frequencies with NTK scaling. Please refer to the
|
||||
[original paper](https://huggingface.co/papers/2309.00071)
|
||||
|
||||
Args:
|
||||
config ([`~transformers.PretrainedConfig`]):
|
||||
config ([`~transformers.PreTrainedConfig`]):
|
||||
The model configuration. This function assumes that the config will provide at least the following
|
||||
properties:
|
||||
|
||||
@ -369,14 +369,14 @@ def _compute_yarn_parameters(
|
||||
|
||||
|
||||
def _compute_longrope_parameters(
|
||||
config: PretrainedConfig, device: "torch.device", seq_len: Optional[int] = None
|
||||
config: PreTrainedConfig, device: "torch.device", seq_len: Optional[int] = None
|
||||
) -> tuple["torch.Tensor", float]:
|
||||
"""
|
||||
Computes the inverse frequencies with LongRoPE scaling. Please refer to the
|
||||
[original implementation](https://github.com/microsoft/LongRoPE)
|
||||
|
||||
Args:
|
||||
config ([`~transformers.PretrainedConfig`]):
|
||||
config ([`~transformers.PreTrainedConfig`]):
|
||||
The model configuration. This function assumes that the config will provide at least the following
|
||||
properties:
|
||||
|
||||
@ -451,13 +451,13 @@ def _compute_longrope_parameters(
|
||||
|
||||
|
||||
def _compute_llama3_parameters(
|
||||
config: PretrainedConfig, device: "torch.device", seq_len: Optional[int] = None
|
||||
config: PreTrainedConfig, device: "torch.device", seq_len: Optional[int] = None
|
||||
) -> tuple["torch.Tensor", float]:
|
||||
"""
|
||||
Computes the inverse frequencies for llama 3.1.
|
||||
|
||||
Args:
|
||||
config ([`~transformers.PretrainedConfig`]):
|
||||
config ([`~transformers.PreTrainedConfig`]):
|
||||
The model configuration. This function assumes that the config will provide at least the following
|
||||
properties:
|
||||
|
||||
@ -557,7 +557,7 @@ def _check_received_keys(
|
||||
logger.warning(f"Unrecognized keys in `rope_scaling` for 'rope_type'='{rope_type}': {unused_keys}")
|
||||
|
||||
|
||||
def _validate_default_rope_parameters(config: PretrainedConfig, ignore_keys: Optional[set] = None):
|
||||
def _validate_default_rope_parameters(config: PreTrainedConfig, ignore_keys: Optional[set] = None):
|
||||
rope_scaling = config.rope_scaling
|
||||
rope_type = rope_scaling.get("rope_type", rope_scaling.get("type", None)) # BC: "rope_type" was originally "type"
|
||||
required_keys = {"rope_type"}
|
||||
@ -565,7 +565,7 @@ def _validate_default_rope_parameters(config: PretrainedConfig, ignore_keys: Opt
|
||||
_check_received_keys(rope_type, received_keys, required_keys, ignore_keys=ignore_keys)
|
||||
|
||||
|
||||
def _validate_linear_scaling_rope_parameters(config: PretrainedConfig, ignore_keys: Optional[set] = None):
|
||||
def _validate_linear_scaling_rope_parameters(config: PreTrainedConfig, ignore_keys: Optional[set] = None):
|
||||
rope_scaling = config.rope_scaling
|
||||
rope_type = rope_scaling.get("rope_type", rope_scaling.get("type", None)) # BC: "rope_type" was originally "type"
|
||||
required_keys = {"rope_type", "factor"}
|
||||
@ -577,7 +577,7 @@ def _validate_linear_scaling_rope_parameters(config: PretrainedConfig, ignore_ke
|
||||
logger.warning(f"`rope_scaling`'s factor field must be a float >= 1, got {factor}")
|
||||
|
||||
|
||||
def _validate_dynamic_scaling_rope_parameters(config: PretrainedConfig, ignore_keys: Optional[set] = None):
|
||||
def _validate_dynamic_scaling_rope_parameters(config: PreTrainedConfig, ignore_keys: Optional[set] = None):
|
||||
rope_scaling = config.rope_scaling
|
||||
rope_type = rope_scaling.get("rope_type", rope_scaling.get("type", None)) # BC: "rope_type" was originally "type"
|
||||
required_keys = {"rope_type", "factor"}
|
||||
@ -591,7 +591,7 @@ def _validate_dynamic_scaling_rope_parameters(config: PretrainedConfig, ignore_k
|
||||
logger.warning(f"`rope_scaling`'s factor field must be a float >= 1, got {factor}")
|
||||
|
||||
|
||||
def _validate_yarn_parameters(config: PretrainedConfig, ignore_keys: Optional[set] = None):
|
||||
def _validate_yarn_parameters(config: PreTrainedConfig, ignore_keys: Optional[set] = None):
|
||||
rope_scaling = config.rope_scaling
|
||||
rope_type = rope_scaling.get("rope_type", rope_scaling.get("type", None)) # BC: "rope_type" was originally "type"
|
||||
required_keys = {"rope_type", "factor"}
|
||||
@ -657,7 +657,7 @@ def _validate_yarn_parameters(config: PretrainedConfig, ignore_keys: Optional[se
|
||||
)
|
||||
|
||||
|
||||
def _validate_longrope_parameters(config: PretrainedConfig, ignore_keys: Optional[set] = None):
|
||||
def _validate_longrope_parameters(config: PreTrainedConfig, ignore_keys: Optional[set] = None):
|
||||
rope_scaling = config.rope_scaling
|
||||
rope_type = rope_scaling.get("rope_type", rope_scaling.get("type", None)) # BC: "rope_type" was originally "type"
|
||||
required_keys = {"rope_type", "short_factor", "long_factor"}
|
||||
@ -707,7 +707,7 @@ def _validate_longrope_parameters(config: PretrainedConfig, ignore_keys: Optiona
|
||||
)
|
||||
|
||||
|
||||
def _validate_llama3_parameters(config: PretrainedConfig, ignore_keys: Optional[set] = None):
|
||||
def _validate_llama3_parameters(config: PreTrainedConfig, ignore_keys: Optional[set] = None):
|
||||
rope_scaling = config.rope_scaling
|
||||
rope_type = rope_scaling.get("rope_type", rope_scaling.get("type", None)) # BC: "rope_type" was originally "type"
|
||||
required_keys = {"rope_type", "factor", "original_max_position_embeddings", "low_freq_factor", "high_freq_factor"}
|
||||
@ -754,11 +754,11 @@ ROPE_VALIDATION_FUNCTIONS = {
|
||||
}
|
||||
|
||||
|
||||
def rope_config_validation(config: PretrainedConfig, ignore_keys: Optional[set] = None):
|
||||
def rope_config_validation(config: PreTrainedConfig, ignore_keys: Optional[set] = None):
|
||||
"""
|
||||
Validate the RoPE config arguments, given a `PretrainedConfig` object
|
||||
Validate the RoPE config arguments, given a `PreTrainedConfig` object
|
||||
"""
|
||||
rope_scaling = getattr(config, "rope_scaling", None) # not a default parameter in `PretrainedConfig`
|
||||
rope_scaling = getattr(config, "rope_scaling", None) # not a default parameter in `PreTrainedConfig`
|
||||
if rope_scaling is None:
|
||||
return
|
||||
|
||||
|
@ -44,7 +44,7 @@ from torch import Tensor, nn
|
||||
from torch.distributions import constraints
|
||||
from torch.utils.checkpoint import checkpoint
|
||||
|
||||
from .configuration_utils import PretrainedConfig
|
||||
from .configuration_utils import PreTrainedConfig
|
||||
from .distributed import DistributedConfig
|
||||
from .dynamic_module_utils import custom_object_save
|
||||
from .generation import CompileConfig, GenerationConfig
|
||||
@ -1149,11 +1149,11 @@ def _get_dtype(
|
||||
cls,
|
||||
dtype: Optional[Union[str, torch.dtype, dict]],
|
||||
checkpoint_files: Optional[list[str]],
|
||||
config: PretrainedConfig,
|
||||
config: PreTrainedConfig,
|
||||
sharded_metadata: Optional[dict],
|
||||
state_dict: Optional[dict],
|
||||
weights_only: bool,
|
||||
) -> tuple[PretrainedConfig, Optional[torch.dtype], Optional[torch.dtype]]:
|
||||
) -> tuple[PreTrainedConfig, Optional[torch.dtype], Optional[torch.dtype]]:
|
||||
"""Find the correct `dtype` to use based on provided arguments. Also update the `config` based on the
|
||||
inferred dtype. We do the following:
|
||||
1. If dtype is not None, we use that dtype
|
||||
@ -1780,7 +1780,7 @@ class PreTrainedModel(nn.Module, EmbeddingAccessMixin, ModuleUtilsMixin, PushToH
|
||||
|
||||
Class attributes (overridden by derived classes):
|
||||
|
||||
- **config_class** ([`PretrainedConfig`]) -- A subclass of [`PretrainedConfig`] to use as configuration class
|
||||
- **config_class** ([`PreTrainedConfig`]) -- A subclass of [`PreTrainedConfig`] to use as configuration class
|
||||
for this model architecture.
|
||||
- **base_model_prefix** (`str`) -- A string indicating the attribute associated to the base model in derived
|
||||
classes of the same architecture adding modules on top of the base model.
|
||||
@ -1935,12 +1935,12 @@ class PreTrainedModel(nn.Module, EmbeddingAccessMixin, ModuleUtilsMixin, PushToH
|
||||
elif full_annotation is not None:
|
||||
cls.config_class = full_annotation
|
||||
|
||||
def __init__(self, config: PretrainedConfig, *inputs, **kwargs):
|
||||
def __init__(self, config: PreTrainedConfig, *inputs, **kwargs):
|
||||
super().__init__()
|
||||
if not isinstance(config, PretrainedConfig):
|
||||
if not isinstance(config, PreTrainedConfig):
|
||||
raise TypeError(
|
||||
f"Parameter config in `{self.__class__.__name__}(config)` should be an instance of class "
|
||||
"`PretrainedConfig`. To create a model from a pretrained model use "
|
||||
"`PreTrainedConfig`. To create a model from a pretrained model use "
|
||||
f"`model = {self.__class__.__name__}.from_pretrained(PRETRAINED_MODEL_NAME)`"
|
||||
)
|
||||
self.config = config
|
||||
@ -4250,7 +4250,7 @@ class PreTrainedModel(nn.Module, EmbeddingAccessMixin, ModuleUtilsMixin, PushToH
|
||||
cls: type[SpecificPreTrainedModelType],
|
||||
pretrained_model_name_or_path: Optional[Union[str, os.PathLike]],
|
||||
*model_args,
|
||||
config: Optional[Union[PretrainedConfig, str, os.PathLike]] = None,
|
||||
config: Optional[Union[PreTrainedConfig, str, os.PathLike]] = None,
|
||||
cache_dir: Optional[Union[str, os.PathLike]] = None,
|
||||
ignore_mismatched_sizes: bool = False,
|
||||
force_download: bool = False,
|
||||
@ -4285,11 +4285,11 @@ class PreTrainedModel(nn.Module, EmbeddingAccessMixin, ModuleUtilsMixin, PushToH
|
||||
arguments `config` and `state_dict`).
|
||||
model_args (sequence of positional arguments, *optional*):
|
||||
All remaining positional arguments will be passed to the underlying model's `__init__` method.
|
||||
config (`Union[PretrainedConfig, str, os.PathLike]`, *optional*):
|
||||
config (`Union[PreTrainedConfig, str, os.PathLike]`, *optional*):
|
||||
Can be either:
|
||||
|
||||
- an instance of a class derived from [`PretrainedConfig`],
|
||||
- a string or path valid as input to [`~PretrainedConfig.from_pretrained`].
|
||||
- an instance of a class derived from [`PreTrainedConfig`],
|
||||
- a string or path valid as input to [`~PreTrainedConfig.from_pretrained`].
|
||||
|
||||
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
|
||||
be automatically loaded when:
|
||||
@ -4437,7 +4437,7 @@ class PreTrainedModel(nn.Module, EmbeddingAccessMixin, ModuleUtilsMixin, PushToH
|
||||
underlying model's `__init__` method (we assume all relevant updates to the configuration have
|
||||
already been done)
|
||||
- If a configuration is not provided, `kwargs` will be first passed to the configuration class
|
||||
initialization function ([`~PretrainedConfig.from_pretrained`]). Each key of `kwargs` that
|
||||
initialization function ([`~PreTrainedConfig.from_pretrained`]). Each key of `kwargs` that
|
||||
corresponds to a configuration attribute will be used to override said attribute with the
|
||||
supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute
|
||||
will be passed to the underlying model's `__init__` function.
|
||||
@ -4574,7 +4574,7 @@ class PreTrainedModel(nn.Module, EmbeddingAccessMixin, ModuleUtilsMixin, PushToH
|
||||
raise ValueError("accelerate is required when loading a GGUF file `pip install accelerate`.")
|
||||
|
||||
if commit_hash is None:
|
||||
if not isinstance(config, PretrainedConfig):
|
||||
if not isinstance(config, PreTrainedConfig):
|
||||
# We make a call to the config file first (which may be absent) to get the commit hash as soon as possible
|
||||
resolved_config_file = cached_file(
|
||||
pretrained_model_name_or_path,
|
||||
@ -4681,7 +4681,7 @@ class PreTrainedModel(nn.Module, EmbeddingAccessMixin, ModuleUtilsMixin, PushToH
|
||||
local_files_only = True
|
||||
|
||||
# Load config if we don't provide a configuration
|
||||
if not isinstance(config, PretrainedConfig):
|
||||
if not isinstance(config, PreTrainedConfig):
|
||||
config_path = config if config is not None else pretrained_model_name_or_path
|
||||
config, model_kwargs = cls.config_class.from_pretrained(
|
||||
config_path,
|
||||
|
@ -21,22 +21,22 @@
|
||||
|
||||
from typing import Optional
|
||||
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...utils import logging
|
||||
|
||||
|
||||
logger = logging.get_logger(__name__)
|
||||
|
||||
|
||||
class Aimv2VisionConfig(PretrainedConfig):
|
||||
class Aimv2VisionConfig(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`Aimv2VisionModel`]. It is used to instantiate a
|
||||
AIMv2 vision encoder according to the specified arguments, defining the model architecture. Instantiating a
|
||||
configuration with the defaults will yield a similar configuration to that of the vision encoder of the AIMv2
|
||||
[apple/aimv2-large-patch14-224](https://huggingface.co/apple/aimv2-large-patch14-224) architecture.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
Args:
|
||||
hidden_size (`int`, *optional*, defaults to 1024):
|
||||
@ -127,15 +127,15 @@ class Aimv2VisionConfig(PretrainedConfig):
|
||||
self.is_native = is_native
|
||||
|
||||
|
||||
class Aimv2TextConfig(PretrainedConfig):
|
||||
class Aimv2TextConfig(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`Aimv2TextModel`]. It is used to instantiate a
|
||||
AIMv2 text encoder according to the specified arguments, defining the model architecture. Instantiating a
|
||||
configuration with the defaults will yield a similar configuration to that of the text encoder of the AIMv2
|
||||
[apple/aimv2-large-patch14-224-lit](https://huggingface.co/apple/aimv2-large-patch14-224-lit) architecture.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
Args:
|
||||
vocab_size (`int`, *optional*, defaults to 49408):
|
||||
@ -212,15 +212,15 @@ class Aimv2TextConfig(PretrainedConfig):
|
||||
self.rms_norm_eps = rms_norm_eps
|
||||
|
||||
|
||||
class Aimv2Config(PretrainedConfig):
|
||||
class Aimv2Config(PreTrainedConfig):
|
||||
r"""
|
||||
[`Aimv2Config`] is the configuration class to store the configuration of a [`Aimv2Model`]. It is used to
|
||||
instantiate a AIMv2 model according to the specified arguments, defining the text model and vision model configs.
|
||||
Instantiating a configuration with the defaults will yield a similar configuration to that of the AIMv2
|
||||
[apple/aimv2-large-patch14-224-lit](https://huggingface.co/apple/aimv2-large-patch14-224-lit) architecture.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
Args:
|
||||
text_config (`dict`, *optional*):
|
||||
|
@ -47,8 +47,8 @@ class Aimv2VisionConfig(SiglipVisionConfig):
|
||||
configuration with the defaults will yield a similar configuration to that of the vision encoder of the AIMv2
|
||||
[apple/aimv2-large-patch14-224](https://huggingface.co/apple/aimv2-large-patch14-224) architecture.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
Args:
|
||||
hidden_size (`int`, *optional*, defaults to 1024):
|
||||
@ -147,8 +147,8 @@ class Aimv2TextConfig(SiglipTextConfig):
|
||||
configuration with the defaults will yield a similar configuration to that of the text encoder of the AIMv2
|
||||
[apple/aimv2-large-patch14-224-lit](https://huggingface.co/apple/aimv2-large-patch14-224-lit) architecture.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
Args:
|
||||
vocab_size (`int`, *optional*, defaults to 49408):
|
||||
@ -238,8 +238,8 @@ class Aimv2Config(SiglipConfig):
|
||||
Instantiating a configuration with the defaults will yield a similar configuration to that of the AIMv2
|
||||
[apple/aimv2-large-patch14-224-lit](https://huggingface.co/apple/aimv2-large-patch14-224-lit) architecture.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
Args:
|
||||
text_config (`dict`, *optional*):
|
||||
|
@ -18,19 +18,19 @@
|
||||
from collections import OrderedDict
|
||||
from collections.abc import Mapping
|
||||
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...onnx import OnnxConfig
|
||||
|
||||
|
||||
class AlbertConfig(PretrainedConfig):
|
||||
class AlbertConfig(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`AlbertModel`] or a [`TFAlbertModel`]. It is used
|
||||
to instantiate an ALBERT model according to the specified arguments, defining the model architecture. Instantiating
|
||||
a configuration with the defaults will yield a similar configuration to that of the ALBERT
|
||||
[albert/albert-xxlarge-v2](https://huggingface.co/albert/albert-xxlarge-v2) architecture.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
Args:
|
||||
vocab_size (`int`, *optional*, defaults to 30000):
|
||||
|
@ -14,14 +14,14 @@
|
||||
# limitations under the License.
|
||||
"""ALIGN model configuration"""
|
||||
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...utils import logging
|
||||
|
||||
|
||||
logger = logging.get_logger(__name__)
|
||||
|
||||
|
||||
class AlignTextConfig(PretrainedConfig):
|
||||
class AlignTextConfig(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`AlignTextModel`]. It is used to instantiate a
|
||||
ALIGN text encoder according to the specified arguments, defining the model architecture. Instantiating a
|
||||
@ -29,8 +29,8 @@ class AlignTextConfig(PretrainedConfig):
|
||||
[kakaobrain/align-base](https://huggingface.co/kakaobrain/align-base) architecture. The default values here are
|
||||
copied from BERT.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
Args:
|
||||
vocab_size (`int`, *optional*, defaults to 30522):
|
||||
@ -128,7 +128,7 @@ class AlignTextConfig(PretrainedConfig):
|
||||
self.pad_token_id = pad_token_id
|
||||
|
||||
|
||||
class AlignVisionConfig(PretrainedConfig):
|
||||
class AlignVisionConfig(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`AlignVisionModel`]. It is used to instantiate a
|
||||
ALIGN vision encoder according to the specified arguments, defining the model architecture. Instantiating a
|
||||
@ -136,8 +136,8 @@ class AlignVisionConfig(PretrainedConfig):
|
||||
[kakaobrain/align-base](https://huggingface.co/kakaobrain/align-base) architecture. The default values are copied
|
||||
from EfficientNet (efficientnet-b7)
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
Args:
|
||||
num_channels (`int`, *optional*, defaults to 3):
|
||||
@ -250,15 +250,15 @@ class AlignVisionConfig(PretrainedConfig):
|
||||
self.num_hidden_layers = sum(num_block_repeats) * 4
|
||||
|
||||
|
||||
class AlignConfig(PretrainedConfig):
|
||||
class AlignConfig(PreTrainedConfig):
|
||||
r"""
|
||||
[`AlignConfig`] is the configuration class to store the configuration of a [`AlignModel`]. It is used to
|
||||
instantiate a ALIGN model according to the specified arguments, defining the text model and vision model configs.
|
||||
Instantiating a configuration with the defaults will yield a similar configuration to that of the ALIGN
|
||||
[kakaobrain/align-base](https://huggingface.co/kakaobrain/align-base) architecture.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
Args:
|
||||
text_config (`dict`, *optional*):
|
||||
|
@ -14,22 +14,22 @@
|
||||
# limitations under the License.
|
||||
"""AltCLIP model configuration"""
|
||||
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...utils import logging
|
||||
|
||||
|
||||
logger = logging.get_logger(__name__)
|
||||
|
||||
|
||||
class AltCLIPTextConfig(PretrainedConfig):
|
||||
class AltCLIPTextConfig(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`AltCLIPTextModel`]. It is used to instantiate a
|
||||
AltCLIP text model according to the specified arguments, defining the model architecture. Instantiating a
|
||||
configuration with the defaults will yield a similar configuration to that of the AltCLIP
|
||||
[BAAI/AltCLIP](https://huggingface.co/BAAI/AltCLIP) architecture.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
|
||||
Args:
|
||||
@ -139,15 +139,15 @@ class AltCLIPTextConfig(PretrainedConfig):
|
||||
self.project_dim = project_dim
|
||||
|
||||
|
||||
class AltCLIPVisionConfig(PretrainedConfig):
|
||||
class AltCLIPVisionConfig(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`AltCLIPModel`]. It is used to instantiate an
|
||||
AltCLIP model according to the specified arguments, defining the model architecture. Instantiating a configuration
|
||||
with the defaults will yield a similar configuration to that of the AltCLIP
|
||||
[BAAI/AltCLIP](https://huggingface.co/BAAI/AltCLIP) architecture.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
|
||||
Args:
|
||||
@ -232,15 +232,15 @@ class AltCLIPVisionConfig(PretrainedConfig):
|
||||
self.hidden_act = hidden_act
|
||||
|
||||
|
||||
class AltCLIPConfig(PretrainedConfig):
|
||||
class AltCLIPConfig(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`AltCLIPModel`]. It is used to instantiate an
|
||||
AltCLIP model according to the specified arguments, defining the model architecture. Instantiating a configuration
|
||||
with the defaults will yield a similar configuration to that of the AltCLIP
|
||||
[BAAI/AltCLIP](https://huggingface.co/BAAI/AltCLIP) architecture.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
Args:
|
||||
text_config (`dict`, *optional*):
|
||||
|
@ -20,19 +20,19 @@
# See the License for the specific language governing permissions and
# limitations under the License.

from ...configuration_utils import PretrainedConfig
from ...configuration_utils import PreTrainedConfig
from ...modeling_rope_utils import rope_config_validation


class ApertusConfig(PretrainedConfig):
class ApertusConfig(PreTrainedConfig):
r"""
This is the configuration class to store the configuration of a [`ApertusModel`]. It is used to instantiate a Apertus
model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
defaults will yield a similar configuration to that of the Apertus-8B.
e.g. [swiss-ai/Apertus-8B](https://huggingface.co/swiss-ai/Apertus-8B)

Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PretrainedConfig`] for more information.
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PreTrainedConfig`] for more information.


Args:
@ -48,8 +48,8 @@ class ApertusConfig(LlamaConfig):
defaults will yield a similar configuration to that of the Apertus-8B.
e.g. [swiss-ai/Apertus-8B](https://huggingface.co/swiss-ai/Apertus-8B)

Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PretrainedConfig`] for more information.
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PreTrainedConfig`] for more information.


Args:
@ -19,11 +19,11 @@
# See the License for the specific language governing permissions and
# limitations under the License.

from ...configuration_utils import PretrainedConfig
from ...configuration_utils import PreTrainedConfig
from ...modeling_rope_utils import rope_config_validation


class ArceeConfig(PretrainedConfig):
class ArceeConfig(PreTrainedConfig):
r"""
This is the configuration class to store the configuration of a [`ArceeModel`]. It is used to instantiate an Arcee
model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
@ -33,8 +33,8 @@ class ArceeConfig(PretrainedConfig):
[arcee-ai/AFM-4.5B](https://huggingface.co/arcee-ai/AFM-4.5B)
and were used to build the examples below.

Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PretrainedConfig`] for more information.
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PreTrainedConfig`] for more information.

Args:
    vocab_size (`int`, *optional*, defaults to 32000):
@ -39,8 +39,8 @@ class ArceeConfig(LlamaConfig):
[arcee-ai/AFM-4.5B](https://huggingface.co/arcee-ai/AFM-4.5B)
and were used to build the examples below.

Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PretrainedConfig`] for more information.
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PreTrainedConfig`] for more information.

Args:
    vocab_size (`int`, *optional*, defaults to 32000):
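As a quick illustration of how such a config class is used once instantiated, here is a hedged sketch assuming the `ArceeConfig` export and the documented `vocab_size` argument; the override value is illustrative.

```python
from transformers import ArceeConfig

# Defaults similar to arcee-ai/AFM-4.5B
config = ArceeConfig()

# Documented arguments such as `vocab_size` can be overridden at construction time
config = ArceeConfig(vocab_size=40000)
print(config.vocab_size)
```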
@ -20,12 +20,12 @@
# limitations under the License.
from typing import Optional

from ...configuration_utils import PretrainedConfig
from ...configuration_utils import PreTrainedConfig
from ...modeling_rope_utils import rope_config_validation
from ..auto import CONFIG_MAPPING, AutoConfig


class AriaTextConfig(PretrainedConfig):
class AriaTextConfig(PreTrainedConfig):
r"""
This class handles the configuration for the text component of the Aria model.
Instantiating a configuration with the defaults will yield a similar configuration to that of the model of the Aria
@ -220,15 +220,15 @@ class AriaTextConfig(PretrainedConfig):
self.moe_num_shared_experts = moe_num_shared_experts


class AriaConfig(PretrainedConfig):
class AriaConfig(PreTrainedConfig):
r"""
This class handles the configuration for both vision and text components of the Aria model,
as well as additional parameters for image token handling and projector mapping.
Instantiating a configuration with the defaults will yield a similar configuration to that of the model of the Aria
[rhymes-ai/Aria](https://huggingface.co/rhymes-ai/Aria) architecture.

Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PretrainedConfig`] for more information.
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PreTrainedConfig`] for more information.

Args:
    vision_config (`AriaVisionConfig` or `dict`, *optional*):
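Composite configurations like `AriaConfig` usually accept their sub-configurations either as config objects or as plain dicts. The sketch below is an assumption-heavy illustration: only `vision_config`, `text_config`, and `moe_num_shared_experts` appear in the hunks above, and the dict-merging behaviour is the usual convention for such classes rather than something this commit itself guarantees.

```python
from transformers import AriaConfig

# Defaults similar to rhymes-ai/Aria
config = AriaConfig()

# Sub-model settings may be passed as dicts; unspecified fields keep their defaults
config = AriaConfig(text_config={"moe_num_shared_experts": 2})
print(config.text_config.moe_num_shared_experts)
```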
@ -21,7 +21,7 @@ from torch import nn

from ...activations import ACT2FN
from ...cache_utils import Cache
from ...configuration_utils import PretrainedConfig
from ...configuration_utils import PreTrainedConfig
from ...image_processing_utils import BaseImageProcessor, BatchFeature, get_patch_output_size, select_best_resolution
from ...image_transforms import PaddingMode, convert_to_rgb, pad, resize, to_channel_dimension_format
from ...image_utils import (
@ -221,15 +221,15 @@ class AriaTextConfig(LlamaConfig):
self.moe_num_shared_experts = moe_num_shared_experts


class AriaConfig(PretrainedConfig):
class AriaConfig(PreTrainedConfig):
r"""
This class handles the configuration for both vision and text components of the Aria model,
as well as additional parameters for image token handling and projector mapping.
Instantiating a configuration with the defaults will yield a similar configuration to that of the model of the Aria
[rhymes-ai/Aria](https://huggingface.co/rhymes-ai/Aria) architecture.

Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PretrainedConfig`] for more information.
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PreTrainedConfig`] for more information.

Args:
    vision_config (`AriaVisionConfig` or `dict`, *optional*):
@ -16,14 +16,14 @@

from typing import Any

from ...configuration_utils import PretrainedConfig
from ...configuration_utils import PreTrainedConfig
from ...utils import logging


logger = logging.get_logger(__name__)


class ASTConfig(PretrainedConfig):
class ASTConfig(PreTrainedConfig):
r"""
This is the configuration class to store the configuration of a [`ASTModel`]. It is used to instantiate an AST
model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
@ -31,8 +31,8 @@ class ASTConfig(PretrainedConfig):
[MIT/ast-finetuned-audioset-10-10-0.4593](https://huggingface.co/MIT/ast-finetuned-audioset-10-10-0.4593)
architecture.

Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PretrainedConfig`] for more information.
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PreTrainedConfig`] for more information.

Args:
    hidden_size (`int`, *optional*, defaults to 768):
@ -23,7 +23,7 @@ from collections import OrderedDict
from collections.abc import Iterator
from typing import Any, TypeVar, Union

from ...configuration_utils import PretrainedConfig
from ...configuration_utils import PreTrainedConfig
from ...dynamic_module_utils import get_class_from_dynamic_module, resolve_trust_remote_code
from ...utils import (
    CONFIG_NAME,
@ -65,7 +65,7 @@ FROM_CONFIG_DOCSTRING = """
model's configuration. Use [`~BaseAutoModelClass.from_pretrained`] to load the model weights.

Args:
    config ([`PretrainedConfig`]):
    config ([`PreTrainedConfig`]):
        The model class to instantiate is selected based on the configuration class:

        List options
@ -104,7 +104,7 @@ FROM_PRETRAINED_TORCH_DOCSTRING = """
            [`~PreTrainedModel.save_pretrained`], e.g., `./my_model_directory/`.
    model_args (additional positional arguments, *optional*):
        Will be passed along to the underlying model `__init__()` method.
    config ([`PretrainedConfig`], *optional*):
    config ([`PreTrainedConfig`], *optional*):
        Configuration for the model to use instead of an automatically loaded configuration. Configuration can
        be automatically loaded when:

@ -155,7 +155,7 @@ FROM_PRETRAINED_TORCH_DOCSTRING = """
            underlying model's `__init__` method (we assume all relevant updates to the configuration have
            already been done)
        - If a configuration is not provided, `kwargs` will be first passed to the configuration class
            initialization function ([`~PretrainedConfig.from_pretrained`]). Each key of `kwargs` that
            initialization function ([`~PreTrainedConfig.from_pretrained`]). Each key of `kwargs` that
            corresponds to a configuration attribute will be used to override said attribute with the
            supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute
            will be passed to the underlying model's `__init__` function.
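The kwargs routing described in this docstring can be shown with a small sketch; the checkpoint name is only an example, and `output_attentions` is a standard configuration attribute.

```python
from transformers import AutoModel

# `output_attentions` matches a configuration attribute, so it updates the config
# instead of being forwarded to the model's __init__
model = AutoModel.from_pretrained("google-bert/bert-base-uncased", output_attentions=True)
assert model.config.output_attentions is True
```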
@ -243,7 +243,7 @@ class _BaseAutoModelClass:
            )

    @classmethod
    def _prepare_config_for_auto_class(cls, config: PretrainedConfig) -> PretrainedConfig:
    def _prepare_config_for_auto_class(cls, config: PreTrainedConfig) -> PreTrainedConfig:
        """Additional autoclass-specific config post-loading manipulation. May be overridden in subclasses."""
        return config

@ -284,7 +284,7 @@ class _BaseAutoModelClass:
            hub_kwargs["token"] = token

        if commit_hash is None:
            if not isinstance(config, PretrainedConfig):
            if not isinstance(config, PreTrainedConfig):
                # We make a call to the config file first (which may be absent) to get the commit hash as soon as possible
                resolved_config_file = cached_file(
                    pretrained_model_name_or_path,
@ -315,7 +315,7 @@ class _BaseAutoModelClass:
                adapter_kwargs["_adapter_model_path"] = pretrained_model_name_or_path
                pretrained_model_name_or_path = adapter_config["base_model_name_or_path"]

        if not isinstance(config, PretrainedConfig):
        if not isinstance(config, PreTrainedConfig):
            kwargs_orig = copy.deepcopy(kwargs)
            # ensure not to pollute the config object with dtype="auto" - since it's
            # meaningless in the context of the config object - torch.dtype values are acceptable
@ -396,7 +396,7 @@ class _BaseAutoModelClass:
        Register a new model for this class.

        Args:
            config_class ([`PretrainedConfig`]):
            config_class ([`PreTrainedConfig`]):
                The configuration corresponding to the model to register.
            model_class ([`PreTrainedModel`]):
                The model to register.
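This register API is how custom architectures plug into the auto classes. A minimal sketch follows, with hypothetical `NewModelConfig`/`NewModel` skeletons; it also assumes the renamed `PreTrainedConfig` is exported at the top level, as this commit implies.

```python
from transformers import AutoConfig, AutoModel, PreTrainedConfig, PreTrainedModel

class NewModelConfig(PreTrainedConfig):
    model_type = "new-model"

class NewModel(PreTrainedModel):
    config_class = NewModelConfig  # ties the model to its configuration class

# Make the new architecture discoverable through the auto classes
AutoConfig.register("new-model", NewModelConfig)
AutoModel.register(NewModelConfig, NewModel)
```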
@ -553,7 +553,7 @@ def add_generation_mixin_to_remote_model(model_class):
    return model_class


class _LazyAutoMapping(OrderedDict[type[PretrainedConfig], _LazyAutoMappingValue]):
class _LazyAutoMapping(OrderedDict[type[PreTrainedConfig], _LazyAutoMappingValue]):
    """
    " A mapping config to object (model or tokenizer for instance) that will load keys and values when it is accessed.

@ -574,7 +574,7 @@ class _LazyAutoMapping(OrderedDict[type[PretrainedConfig], _LazyAutoMappingValue
        common_keys = set(self._config_mapping.keys()).intersection(self._model_mapping.keys())
        return len(common_keys) + len(self._extra_content)

    def __getitem__(self, key: type[PretrainedConfig]) -> _LazyAutoMappingValue:
    def __getitem__(self, key: type[PreTrainedConfig]) -> _LazyAutoMappingValue:
        if key in self._extra_content:
            return self._extra_content[key]
        model_type = self._reverse_config_mapping[key.__name__]
@ -596,7 +596,7 @@ class _LazyAutoMapping(OrderedDict[type[PretrainedConfig], _LazyAutoMappingValue
            self._modules[module_name] = importlib.import_module(f".{module_name}", "transformers.models")
        return getattribute_from_module(self._modules[module_name], attr)

    def keys(self) -> list[type[PretrainedConfig]]:
    def keys(self) -> list[type[PreTrainedConfig]]:
        mapping_keys = [
            self._load_attr_from_module(key, name)
            for key, name in self._config_mapping.items()
@ -604,7 +604,7 @@ class _LazyAutoMapping(OrderedDict[type[PretrainedConfig], _LazyAutoMappingValue
        ]
        return mapping_keys + list(self._extra_content.keys())

    def get(self, key: type[PretrainedConfig], default: _T) -> Union[_LazyAutoMappingValue, _T]:
    def get(self, key: type[PreTrainedConfig], default: _T) -> Union[_LazyAutoMappingValue, _T]:
        try:
            return self.__getitem__(key)
        except KeyError:
@ -621,7 +621,7 @@ class _LazyAutoMapping(OrderedDict[type[PretrainedConfig], _LazyAutoMappingValue
        ]
        return mapping_values + list(self._extra_content.values())

    def items(self) -> list[tuple[type[PretrainedConfig], _LazyAutoMappingValue]]:
    def items(self) -> list[tuple[type[PreTrainedConfig], _LazyAutoMappingValue]]:
        mapping_items = [
            (
                self._load_attr_from_module(key, self._config_mapping[key]),
@ -632,7 +632,7 @@ class _LazyAutoMapping(OrderedDict[type[PretrainedConfig], _LazyAutoMappingValue
        ]
        return mapping_items + list(self._extra_content.items())

    def __iter__(self) -> Iterator[type[PretrainedConfig]]:
    def __iter__(self) -> Iterator[type[PreTrainedConfig]]:
        return iter(self.keys())

    def __contains__(self, item: type) -> bool:
@ -643,7 +643,7 @@ class _LazyAutoMapping(OrderedDict[type[PretrainedConfig], _LazyAutoMappingValue
        model_type = self._reverse_config_mapping[item.__name__]
        return model_type in self._model_mapping

    def register(self, key: type[PretrainedConfig], value: _LazyAutoMappingValue, exist_ok=False) -> None:
    def register(self, key: type[PreTrainedConfig], value: _LazyAutoMappingValue, exist_ok=False) -> None:
        """
        Register a new model in this mapping.
        """
@ -22,7 +22,7 @@ from collections import OrderedDict
from collections.abc import Callable, Iterator, KeysView, ValuesView
from typing import Any, TypeVar, Union

from ...configuration_utils import PretrainedConfig
from ...configuration_utils import PreTrainedConfig
from ...dynamic_module_utils import get_class_from_dynamic_module, resolve_trust_remote_code
from ...utils import CONFIG_NAME, logging

@ -1031,7 +1031,7 @@ def config_class_to_model_type(config) -> Union[str, None]:
    return None


class _LazyConfigMapping(OrderedDict[str, type[PretrainedConfig]]):
class _LazyConfigMapping(OrderedDict[str, type[PreTrainedConfig]]):
    """
    A dictionary that lazily load its values when they are requested.
    """
@ -1041,7 +1041,7 @@ class _LazyConfigMapping(OrderedDict[str, type[PretrainedConfig]]):
        self._extra_content = {}
        self._modules = {}

    def __getitem__(self, key: str) -> type[PretrainedConfig]:
    def __getitem__(self, key: str) -> type[PreTrainedConfig]:
        if key in self._extra_content:
            return self._extra_content[key]
        if key not in self._mapping:
@ -1061,10 +1061,10 @@ class _LazyConfigMapping(OrderedDict[str, type[PretrainedConfig]]):
    def keys(self) -> list[str]:
        return list(self._mapping.keys()) + list(self._extra_content.keys())

    def values(self) -> list[type[PretrainedConfig]]:
    def values(self) -> list[type[PreTrainedConfig]]:
        return [self[k] for k in self._mapping] + list(self._extra_content.values())

    def items(self) -> list[tuple[str, type[PretrainedConfig]]]:
    def items(self) -> list[tuple[str, type[PreTrainedConfig]]]:
        return [(k, self[k]) for k in self._mapping] + list(self._extra_content.items())

    def __iter__(self) -> Iterator[str]:
@ -1073,7 +1073,7 @@ class _LazyConfigMapping(OrderedDict[str, type[PretrainedConfig]]):
    def __contains__(self, item: object) -> bool:
        return item in self._mapping or item in self._extra_content

    def register(self, key: str, value: type[PretrainedConfig], exist_ok=False) -> None:
    def register(self, key: str, value: type[PreTrainedConfig], exist_ok=False) -> None:
        """
        Register a new configuration in this mapping.
        """
@ -1219,7 +1219,7 @@ class AutoConfig:
            )

    @classmethod
    def for_model(cls, model_type: str, *args, **kwargs) -> PretrainedConfig:
    def for_model(cls, model_type: str, *args, **kwargs) -> PreTrainedConfig:
        if model_type in CONFIG_MAPPING:
            config_class = CONFIG_MAPPING[model_type]
            return config_class(*args, **kwargs)
@ -1245,7 +1245,7 @@ class AutoConfig:
                - A string, the *model id* of a pretrained model configuration hosted inside a model repo on
                  huggingface.co.
                - A path to a *directory* containing a configuration file saved using the
                  [`~PretrainedConfig.save_pretrained`] method, or the [`~PreTrainedModel.save_pretrained`] method,
                  [`~PreTrainedConfig.save_pretrained`] method, or the [`~PreTrainedModel.save_pretrained`] method,
                  e.g., `./my_model_directory/`.
                - A path or url to a saved configuration JSON *file*, e.g.,
                  `./my_model_directory/configuration.json`.
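The three accepted forms of `pretrained_model_name_or_path` listed above can be summarised in a quick sketch; the model id and local paths are illustrative.

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("google-bert/bert-base-uncased")            # model id on huggingface.co
config = AutoConfig.from_pretrained("./my_model_directory/")                    # directory containing a saved config
config = AutoConfig.from_pretrained("./my_model_directory/configuration.json")  # path to a config JSON file
```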
@ -1326,7 +1326,7 @@ class AutoConfig:
        trust_remote_code = kwargs.pop("trust_remote_code", None)
        code_revision = kwargs.pop("code_revision", None)

        config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
        config_dict, unused_kwargs = PreTrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
        has_remote_code = "auto_map" in config_dict and "AutoConfig" in config_dict["auto_map"]
        has_local_code = "model_type" in config_dict and config_dict["model_type"] in CONFIG_MAPPING
        if has_remote_code:
@ -1387,9 +1387,9 @@ class AutoConfig:

        Args:
            model_type (`str`): The model type like "bert" or "gpt".
            config ([`PretrainedConfig`]): The config to register.
            config ([`PreTrainedConfig`]): The config to register.
        """
        if issubclass(config, PretrainedConfig) and config.model_type != model_type:
        if issubclass(config, PreTrainedConfig) and config.model_type != model_type:
            raise ValueError(
                "The config you are passing has a `model_type` attribute that is not consistent with the model type "
                f"you passed (config has {config.model_type} and you passed {model_type}. Fix one of those so they "
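The consistency check above means a registered config's `model_type` must match the string it is registered under. A minimal sketch follows; the class name is hypothetical, and `PreTrainedConfig` is assumed to be exported at the top level after this rename.

```python
from transformers import AutoConfig, PreTrainedConfig

class MyConfig(PreTrainedConfig):
    model_type = "my-model"  # must match the first argument to register(), or a ValueError is raised

AutoConfig.register("my-model", MyConfig)
```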
@ -22,7 +22,7 @@ from collections import OrderedDict
|
||||
from typing import Optional, Union
|
||||
|
||||
# Build the list of all feature extractors
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...dynamic_module_utils import get_class_from_dynamic_module, resolve_trust_remote_code
|
||||
from ...feature_extraction_utils import FeatureExtractionMixin
|
||||
from ...utils import CONFIG_NAME, FEATURE_EXTRACTOR_NAME, cached_file, logging
|
||||
@ -309,7 +309,7 @@ class AutoFeatureExtractor:
|
||||
|
||||
# If we don't find the feature extractor class in the feature extractor config, let's try the model config.
|
||||
if feature_extractor_class is None and feature_extractor_auto_map is None:
|
||||
if not isinstance(config, PretrainedConfig):
|
||||
if not isinstance(config, PreTrainedConfig):
|
||||
config = AutoConfig.from_pretrained(
|
||||
pretrained_model_name_or_path, trust_remote_code=trust_remote_code, **kwargs
|
||||
)
|
||||
@ -358,7 +358,7 @@ class AutoFeatureExtractor:
|
||||
Register a new feature extractor for this class.
|
||||
|
||||
Args:
|
||||
config_class ([`PretrainedConfig`]):
|
||||
config_class ([`PreTrainedConfig`]):
|
||||
The configuration corresponding to the model to register.
|
||||
feature_extractor_class ([`FeatureExtractorMixin`]): The feature extractor to register.
|
||||
"""
|
||||
|
@ -22,7 +22,7 @@ from collections import OrderedDict
|
||||
from typing import TYPE_CHECKING, Optional, Union
|
||||
|
||||
# Build the list of all image processors
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...dynamic_module_utils import get_class_from_dynamic_module, resolve_trust_remote_code
|
||||
from ...image_processing_utils import ImageProcessingMixin
|
||||
from ...image_processing_utils_fast import BaseImageProcessorFast
|
||||
@ -502,7 +502,7 @@ class AutoImageProcessor:
|
||||
|
||||
# If we don't find the image processor class in the image processor config, let's try the model config.
|
||||
if image_processor_type is None and image_processor_auto_map is None:
|
||||
if not isinstance(config, PretrainedConfig):
|
||||
if not isinstance(config, PreTrainedConfig):
|
||||
config = AutoConfig.from_pretrained(
|
||||
pretrained_model_name_or_path,
|
||||
trust_remote_code=trust_remote_code,
|
||||
@ -629,7 +629,7 @@ class AutoImageProcessor:
|
||||
Register a new image processor for this class.
|
||||
|
||||
Args:
|
||||
config_class ([`PretrainedConfig`]):
|
||||
config_class ([`PreTrainedConfig`]):
|
||||
The configuration corresponding to the model to register.
|
||||
image_processor_class ([`ImageProcessingMixin`]): The image processor to register.
|
||||
"""
|
||||
|
@ -21,7 +21,7 @@ import warnings
|
||||
from collections import OrderedDict
|
||||
|
||||
# Build the list of all feature extractors
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...dynamic_module_utils import get_class_from_dynamic_module, resolve_trust_remote_code
|
||||
from ...feature_extraction_utils import FeatureExtractionMixin
|
||||
from ...image_processing_utils import ImageProcessingMixin
|
||||
@ -356,7 +356,7 @@ class AutoProcessor:
|
||||
|
||||
if processor_class is None:
|
||||
# Otherwise, load config, if it can be loaded.
|
||||
if not isinstance(config, PretrainedConfig):
|
||||
if not isinstance(config, PreTrainedConfig):
|
||||
config = AutoConfig.from_pretrained(
|
||||
pretrained_model_name_or_path, trust_remote_code=trust_remote_code, **kwargs
|
||||
)
|
||||
@ -430,7 +430,7 @@ class AutoProcessor:
|
||||
Register a new processor for this class.
|
||||
|
||||
Args:
|
||||
config_class ([`PretrainedConfig`]):
|
||||
config_class ([`PreTrainedConfig`]):
|
||||
The configuration corresponding to the model to register.
|
||||
processor_class ([`ProcessorMixin`]): The processor to register.
|
||||
"""
|
||||
|
@ -23,7 +23,7 @@ from typing import Any, Optional, Union
|
||||
|
||||
from transformers.utils.import_utils import is_mistral_common_available
|
||||
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...dynamic_module_utils import get_class_from_dynamic_module, resolve_trust_remote_code
|
||||
from ...modeling_gguf_pytorch_utils import load_gguf_checkpoint
|
||||
from ...tokenization_utils import PreTrainedTokenizer
|
||||
@ -962,7 +962,7 @@ class AutoTokenizer:
|
||||
applicable to all derived classes)
|
||||
inputs (additional positional arguments, *optional*):
|
||||
Will be passed along to the Tokenizer `__init__()` method.
|
||||
config ([`PretrainedConfig`], *optional*)
|
||||
config ([`PreTrainedConfig`], *optional*)
|
||||
The configuration object used to determine the tokenizer class to instantiate.
|
||||
cache_dir (`str` or `os.PathLike`, *optional*):
|
||||
Path to a directory in which a downloaded pretrained model configuration should be cached if the
|
||||
@ -1076,7 +1076,7 @@ class AutoTokenizer:
|
||||
|
||||
# If that did not work, let's try to use the config.
|
||||
if config_tokenizer_class is None:
|
||||
if not isinstance(config, PretrainedConfig):
|
||||
if not isinstance(config, PreTrainedConfig):
|
||||
if gguf_file:
|
||||
gguf_path = cached_file(pretrained_model_name_or_path, gguf_file, **kwargs)
|
||||
config_dict = load_gguf_checkpoint(gguf_path, return_tensors=False)["config"]
|
||||
@ -1170,7 +1170,7 @@ class AutoTokenizer:
|
||||
|
||||
|
||||
Args:
|
||||
config_class ([`PretrainedConfig`]):
|
||||
config_class ([`PreTrainedConfig`]):
|
||||
The configuration corresponding to the model to register.
|
||||
slow_tokenizer_class ([`PretrainedTokenizer`], *optional*):
|
||||
The slow tokenizer to register.
|
||||
|
@ -22,7 +22,7 @@ from collections import OrderedDict
|
||||
from typing import TYPE_CHECKING, Optional, Union
|
||||
|
||||
# Build the list of all video processors
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...dynamic_module_utils import get_class_from_dynamic_module, resolve_trust_remote_code
|
||||
from ...utils import CONFIG_NAME, VIDEO_PROCESSOR_NAME, cached_file, is_torchvision_available, logging
|
||||
from ...utils.import_utils import requires
|
||||
@ -321,7 +321,7 @@ class AutoVideoProcessor:
|
||||
|
||||
# If we don't find the video processor class in the video processor config, let's try the model config.
|
||||
if video_processor_class is None and video_processor_auto_map is None:
|
||||
if not isinstance(config, PretrainedConfig):
|
||||
if not isinstance(config, PreTrainedConfig):
|
||||
config = AutoConfig.from_pretrained(
|
||||
pretrained_model_name_or_path, trust_remote_code=trust_remote_code, **kwargs
|
||||
)
|
||||
@ -374,7 +374,7 @@ class AutoVideoProcessor:
|
||||
Register a new video processor for this class.
|
||||
|
||||
Args:
|
||||
config_class ([`PretrainedConfig`]):
|
||||
config_class ([`PreTrainedConfig`]):
|
||||
The configuration corresponding to the model to register.
|
||||
video_processor_class ([`BaseVideoProcessor`]):
|
||||
The video processor to register.
|
||||
|
@ -16,14 +16,14 @@
|
||||
|
||||
from typing import Optional
|
||||
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...utils import logging
|
||||
|
||||
|
||||
logger = logging.get_logger(__name__)
|
||||
|
||||
|
||||
class AutoformerConfig(PretrainedConfig):
|
||||
class AutoformerConfig(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of an [`AutoformerModel`]. It is used to instantiate an
|
||||
Autoformer model according to the specified arguments, defining the model architecture. Instantiating a
|
||||
@ -31,8 +31,8 @@ class AutoformerConfig(PretrainedConfig):
|
||||
[huggingface/autoformer-tourism-monthly](https://huggingface.co/huggingface/autoformer-tourism-monthly)
|
||||
architecture.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
Args:
|
||||
prediction_length (`int`):
|
||||
|
@ -14,7 +14,7 @@
|
||||
# limitations under the License.
|
||||
"""AyaVision model configuration"""
|
||||
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...utils import logging
|
||||
from ..auto import CONFIG_MAPPING, AutoConfig
|
||||
|
||||
@ -22,15 +22,15 @@ from ..auto import CONFIG_MAPPING, AutoConfig
|
||||
logger = logging.get_logger(__name__)
|
||||
|
||||
|
||||
class AyaVisionConfig(PretrainedConfig):
|
||||
class AyaVisionConfig(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`AyaVisionForConditionalGeneration`]. It is used to instantiate an
|
||||
AyaVision model according to the specified arguments, defining the model architecture. Instantiating a configuration
|
||||
with the defaults will yield a similar configuration to that of AyaVision.
|
||||
e.g. [CohereForAI/aya-vision-8b](https://huggingface.co/CohereForAI/aya-vision-8b)
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
Args:
|
||||
vision_config (`Union[AutoConfig, dict]`, *optional*, defaults to `SiglipVisionConfig`):
|
||||
|
@ -14,14 +14,14 @@
|
||||
# limitations under the License.
|
||||
"""Bamba model configuration"""
|
||||
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...utils import logging
|
||||
|
||||
|
||||
logger = logging.get_logger(__name__)
|
||||
|
||||
|
||||
class BambaConfig(PretrainedConfig):
|
||||
class BambaConfig(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`BambaModel`]. It is used to instantiate a
|
||||
BambaModel model according to the specified arguments, defining the model architecture. Instantiating a configuration
|
||||
@ -30,8 +30,8 @@ class BambaConfig(PretrainedConfig):
|
||||
The BambaModel is a hybrid [mamba2](https://github.com/state-spaces/mamba) architecture with SwiGLU.
|
||||
The checkpoints are jointly trained by IBM, Princeton, and UIUC.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
Args:
|
||||
vocab_size (`int`, *optional*, defaults to 128000):
|
||||
|
@ -16,7 +16,7 @@
|
||||
|
||||
from typing import Optional
|
||||
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...utils import add_start_docstrings, logging
|
||||
from ..auto import CONFIG_MAPPING, AutoConfig
|
||||
|
||||
@ -30,8 +30,8 @@ BARK_SUBMODELCONFIG_START_DOCSTRING = """
|
||||
defaults will yield a similar configuration to that of the Bark [suno/bark](https://huggingface.co/suno/bark)
|
||||
architecture.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
Args:
|
||||
block_size (`int`, *optional*, defaults to 1024):
|
||||
@ -62,7 +62,7 @@ BARK_SUBMODELCONFIG_START_DOCSTRING = """
|
||||
"""
|
||||
|
||||
|
||||
class BarkSubModelConfig(PretrainedConfig):
|
||||
class BarkSubModelConfig(PreTrainedConfig):
|
||||
keys_to_ignore_at_inference = ["past_key_values"]
|
||||
|
||||
attribute_map = {
|
||||
@ -180,7 +180,7 @@ class BarkFineConfig(BarkSubModelConfig):
|
||||
super().__init__(tie_word_embeddings=tie_word_embeddings, **kwargs)
|
||||
|
||||
|
||||
class BarkConfig(PretrainedConfig):
|
||||
class BarkConfig(PreTrainedConfig):
|
||||
"""
|
||||
This is the configuration class to store the configuration of a [`BarkModel`]. It is used to instantiate a Bark
|
||||
model according to the specified sub-models configurations, defining the model architecture.
|
||||
@ -188,8 +188,8 @@ class BarkConfig(PretrainedConfig):
|
||||
Instantiating a configuration with the defaults will yield a similar configuration to that of the Bark
|
||||
[suno/bark](https://huggingface.co/suno/bark) architecture.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
Args:
|
||||
semantic_config ([`BarkSemanticConfig`], *optional*):
|
||||
@ -282,7 +282,7 @@ class BarkConfig(PretrainedConfig):
|
||||
semantic_config: BarkSemanticConfig,
|
||||
coarse_acoustics_config: BarkCoarseConfig,
|
||||
fine_acoustics_config: BarkFineConfig,
|
||||
codec_config: PretrainedConfig,
|
||||
codec_config: PreTrainedConfig,
|
||||
**kwargs,
|
||||
):
|
||||
r"""
|
||||
|
@ -315,7 +315,7 @@ class BarkGenerationConfig(GenerationConfig):
|
||||
|
||||
def to_dict(self):
|
||||
"""
|
||||
Serializes this instance to a Python dictionary. Override the default [`~PretrainedConfig.to_dict`].
|
||||
Serializes this instance to a Python dictionary. Override the default [`~PreTrainedConfig.to_dict`].
|
||||
|
||||
Returns:
|
||||
`dict[str, any]`: Dictionary of all the attributes that make up this configuration instance,
|
||||
|
@ -20,7 +20,7 @@ from collections.abc import Mapping
|
||||
from typing import Any
|
||||
|
||||
from ... import PreTrainedTokenizer
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...onnx import OnnxConfig, OnnxConfigWithPast, OnnxSeq2SeqConfigWithPast
|
||||
from ...onnx.utils import compute_effective_axis_dimension
|
||||
from ...utils import is_torch_available, logging
|
||||
@ -29,15 +29,15 @@ from ...utils import is_torch_available, logging
|
||||
logger = logging.get_logger(__name__)
|
||||
|
||||
|
||||
class BartConfig(PretrainedConfig):
|
||||
class BartConfig(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`BartModel`]. It is used to instantiate a BART
|
||||
model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
|
||||
defaults will yield a similar configuration to that of the BART
|
||||
[facebook/bart-large](https://huggingface.co/facebook/bart-large) architecture.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
|
||||
Args:
|
||||
|
@ -20,12 +20,12 @@ from collections.abc import Mapping
|
||||
|
||||
from packaging import version
|
||||
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...onnx import OnnxConfig
|
||||
from ...utils.backbone_utils import BackboneConfigMixin, get_aligned_output_features_output_indices
|
||||
|
||||
|
||||
class BeitConfig(BackboneConfigMixin, PretrainedConfig):
|
||||
class BeitConfig(BackboneConfigMixin, PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`BeitModel`]. It is used to instantiate an BEiT
|
||||
model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
|
||||
|
@ -18,7 +18,7 @@
|
||||
from collections import OrderedDict
|
||||
from collections.abc import Mapping
|
||||
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...onnx import OnnxConfig
|
||||
from ...utils import logging
|
||||
|
||||
@ -26,15 +26,15 @@ from ...utils import logging
|
||||
logger = logging.get_logger(__name__)
|
||||
|
||||
|
||||
class BertConfig(PretrainedConfig):
|
||||
class BertConfig(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`BertModel`] or a [`TFBertModel`]. It is used to
|
||||
instantiate a BERT model according to the specified arguments, defining the model architecture. Instantiating a
|
||||
configuration with the defaults will yield a similar configuration to that of the BERT
|
||||
[google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased) architecture.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
|
||||
Args:
|
||||
|
@ -14,10 +14,10 @@
|
||||
# limitations under the License.
|
||||
"""BertGeneration model configuration"""
|
||||
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
|
||||
|
||||
class BertGenerationConfig(PretrainedConfig):
|
||||
class BertGenerationConfig(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`BertGenerationPreTrainedModel`]. It is used to
|
||||
instantiate a BertGeneration model according to the specified arguments, defining the model architecture.
|
||||
@ -25,8 +25,8 @@ class BertGenerationConfig(PretrainedConfig):
|
||||
[google/bert_for_seq_generation_L-24_bbc_encoder](https://huggingface.co/google/bert_for_seq_generation_L-24_bbc_encoder)
|
||||
architecture.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
Args:
|
||||
vocab_size (`int`, *optional*, defaults to 50358):
|
||||
|
@ -17,7 +17,7 @@
|
||||
from collections import OrderedDict
|
||||
from collections.abc import Mapping
|
||||
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...onnx import OnnxConfig
|
||||
from ...utils import logging
|
||||
|
||||
@ -25,15 +25,15 @@ from ...utils import logging
|
||||
logger = logging.get_logger(__name__)
|
||||
|
||||
|
||||
class BigBirdConfig(PretrainedConfig):
|
||||
class BigBirdConfig(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`BigBirdModel`]. It is used to instantiate an
|
||||
BigBird model according to the specified arguments, defining the model architecture. Instantiating a configuration
|
||||
with the defaults will yield a similar configuration to that of the BigBird
|
||||
[google/bigbird-roberta-base](https://huggingface.co/google/bigbird-roberta-base) architecture.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
|
||||
Args:
|
||||
|
@ -19,7 +19,7 @@ from collections.abc import Mapping
|
||||
from typing import Any
|
||||
|
||||
from ... import PreTrainedTokenizer
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...onnx import OnnxConfig, OnnxConfigWithPast, OnnxSeq2SeqConfigWithPast
|
||||
from ...onnx.utils import compute_effective_axis_dimension
|
||||
from ...utils import is_torch_available, logging
|
||||
@ -28,15 +28,15 @@ from ...utils import is_torch_available, logging
|
||||
logger = logging.get_logger(__name__)
|
||||
|
||||
|
||||
class BigBirdPegasusConfig(PretrainedConfig):
|
||||
class BigBirdPegasusConfig(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`BigBirdPegasusModel`]. It is used to instantiate
|
||||
an BigBirdPegasus model according to the specified arguments, defining the model architecture. Instantiating a
|
||||
configuration with the defaults will yield a similar configuration to that of the BigBirdPegasus
|
||||
[google/bigbird-pegasus-large-arxiv](https://huggingface.co/google/bigbird-pegasus-large-arxiv) architecture.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
|
||||
Args:
|
||||
|
@ -14,22 +14,22 @@
|
||||
# limitations under the License.
|
||||
"""BioGPT model configuration"""
|
||||
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...utils import logging
|
||||
|
||||
|
||||
logger = logging.get_logger(__name__)
|
||||
|
||||
|
||||
class BioGptConfig(PretrainedConfig):
|
||||
class BioGptConfig(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`BioGptModel`]. It is used to instantiate an
|
||||
BioGPT model according to the specified arguments, defining the model architecture. Instantiating a configuration
|
||||
with the defaults will yield a similar configuration to that of the BioGPT
|
||||
[microsoft/biogpt](https://huggingface.co/microsoft/biogpt) architecture.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
|
||||
Args:
|
||||
|
@ -14,7 +14,7 @@
|
||||
# limitations under the License.
|
||||
"""BiT model configuration"""
|
||||
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...utils import logging
|
||||
from ...utils.backbone_utils import BackboneConfigMixin, get_aligned_output_features_output_indices
|
||||
|
||||
@ -22,15 +22,15 @@ from ...utils.backbone_utils import BackboneConfigMixin, get_aligned_output_feat
|
||||
logger = logging.get_logger(__name__)
|
||||
|
||||
|
||||
class BitConfig(BackboneConfigMixin, PretrainedConfig):
|
||||
class BitConfig(BackboneConfigMixin, PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`BitModel`]. It is used to instantiate an BiT
|
||||
model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
|
||||
defaults will yield a similar configuration to that of the BiT
|
||||
[google/bit-50](https://huggingface.co/google/bit-50) architecture.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
Args:
|
||||
num_channels (`int`, *optional*, defaults to 3):
|
||||
|
@ -13,22 +13,22 @@
|
||||
# See the License for the specific language governing permissions and
|
||||
"""BitNet model configuration"""
|
||||
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...utils import logging
|
||||
|
||||
|
||||
logger = logging.get_logger(__name__)
|
||||
|
||||
|
||||
class BitNetConfig(PretrainedConfig):
|
||||
class BitNetConfig(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`BitNetModel`]. It is used to instantiate an BitNet
|
||||
model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
|
||||
defaults will yield a similar configuration to that of
|
||||
BitNet b1.58 2B4T [microsoft/bitnet-b1.58-2B-4T](https://huggingface.co/microsoft/bitnet-b1.58-2B-4T).
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
|
||||
Args:
|
||||
|
@ -19,7 +19,7 @@ from collections.abc import Mapping
|
||||
from typing import Any
|
||||
|
||||
from ... import PreTrainedTokenizer
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...file_utils import is_torch_available
|
||||
from ...onnx import OnnxConfig, OnnxConfigWithPast, OnnxSeq2SeqConfigWithPast
|
||||
from ...onnx.utils import compute_effective_axis_dimension
|
||||
@ -29,15 +29,15 @@ from ...utils import logging
|
||||
logger = logging.get_logger(__name__)
|
||||
|
||||
|
||||
class BlenderbotConfig(PretrainedConfig):
|
||||
class BlenderbotConfig(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`BlenderbotModel`]. It is used to instantiate an
|
||||
Blenderbot model according to the specified arguments, defining the model architecture. Instantiating a
|
||||
configuration with the defaults will yield a similar configuration to that of the Blenderbot
|
||||
[facebook/blenderbot-3B](https://huggingface.co/facebook/blenderbot-3B) architecture.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
|
||||
Args:
|
||||
|
@ -19,7 +19,7 @@ from collections.abc import Mapping
|
||||
from typing import Any
|
||||
|
||||
from ... import PreTrainedTokenizer
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...file_utils import is_torch_available
|
||||
from ...onnx import OnnxConfig, OnnxConfigWithPast, OnnxSeq2SeqConfigWithPast
|
||||
from ...onnx.utils import compute_effective_axis_dimension
|
||||
@ -29,15 +29,15 @@ from ...utils import logging
|
||||
logger = logging.get_logger(__name__)
|
||||
|
||||
|
||||
class BlenderbotSmallConfig(PretrainedConfig):
|
||||
class BlenderbotSmallConfig(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`BlenderbotSmallModel`]. It is used to instantiate
|
||||
an BlenderbotSmall model according to the specified arguments, defining the model architecture. Instantiating a
|
||||
configuration with the defaults will yield a similar configuration to that of the BlenderbotSmall
|
||||
[facebook/blenderbot_small-90M](https://huggingface.co/facebook/blenderbot_small-90M) architecture.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
|
||||
Args:
|
||||
|
@ -14,22 +14,22 @@
|
||||
# limitations under the License.
|
||||
"""Blip model configuration"""
|
||||
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
|
||||
from ...utils import logging
|
||||
|
||||
|
||||
logger = logging.get_logger(__name__)
|
||||
|
||||
|
||||
class BlipTextConfig(PretrainedConfig):
|
||||
class BlipTextConfig(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`BlipTextModel`]. It is used to instantiate a BLIP
|
||||
text model according to the specified arguments, defining the model architecture. Instantiating a configuration
|
||||
with the defaults will yield a similar configuration to that of the `BlipText` used by the [base
|
||||
architectures](https://huggingface.co/Salesforce/blip-vqa-base).
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
|
||||
Args:
|
||||
@ -145,15 +145,15 @@ class BlipTextConfig(PretrainedConfig):
|
||||
self.label_smoothing = label_smoothing
|
||||
|
||||
|
||||
class BlipVisionConfig(PretrainedConfig):
|
||||
class BlipVisionConfig(PreTrainedConfig):
|
||||
r"""
|
||||
This is the configuration class to store the configuration of a [`BlipVisionModel`]. It is used to instantiate a
|
||||
BLIP vision model according to the specified arguments, defining the model architecture. Instantiating a
|
||||
configuration defaults will yield a similar configuration to that of the Blip-base
|
||||
[Salesforce/blip-vqa-base](https://huggingface.co/Salesforce/blip-vqa-base) architecture.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
|
||||
Args:
|
||||
@ -227,15 +227,15 @@ class BlipVisionConfig(PretrainedConfig):
|
||||
self.hidden_act = hidden_act
|
||||
|
||||
|
||||
class BlipConfig(PretrainedConfig):
|
||||
class BlipConfig(PreTrainedConfig):
|
||||
r"""
|
||||
[`BlipConfig`] is the configuration class to store the configuration of a [`BlipModel`]. It is used to instantiate
|
||||
a BLIP model according to the specified arguments, defining the text model and vision model configs. Instantiating
|
||||
a configuration with the defaults will yield a similar configuration to that of the BLIP-base
|
||||
[Salesforce/blip-vqa-base](https://huggingface.co/Salesforce/blip-vqa-base) architecture.
|
||||
|
||||
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PretrainedConfig`] for more information.
|
||||
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
|
||||
documentation from [`PreTrainedConfig`] for more information.
|
||||
|
||||
Args:
|
||||
text_config (`dict`, *optional*):
|
||||
|
@ -16,7 +16,7 @@
|
||||
|
||||
from typing import Optional
|
||||
|
||||
from ...configuration_utils import PretrainedConfig
|
||||
from ...configuration_utils import PreTrainedConfig
from ...models.auto.modeling_auto import MODEL_FOR_CAUSAL_LM_MAPPING_NAMES
from ...utils import logging
from ..auto import CONFIG_MAPPING, AutoConfig
@ -25,15 +25,15 @@ from ..auto import CONFIG_MAPPING, AutoConfig
logger = logging.get_logger(__name__)

class Blip2VisionConfig(PretrainedConfig):
class Blip2VisionConfig(PreTrainedConfig):
r"""
This is the configuration class to store the configuration of a [`Blip2VisionModel`]. It is used to instantiate a
BLIP-2 vision encoder according to the specified arguments, defining the model architecture. Instantiating a
configuration defaults will yield a similar configuration to that of the BLIP-2
[Salesforce/blip2-opt-2.7b](https://huggingface.co/Salesforce/blip2-opt-2.7b) architecture.

Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PretrainedConfig`] for more information.
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PreTrainedConfig`] for more information.

Args:
hidden_size (`int`, *optional*, defaults to 1408):
@ -107,14 +107,14 @@ class Blip2VisionConfig(PretrainedConfig):
self.qkv_bias = qkv_bias

class Blip2QFormerConfig(PretrainedConfig):
class Blip2QFormerConfig(PreTrainedConfig):
r"""
This is the configuration class to store the configuration of a [`Blip2QFormerModel`]. It is used to instantiate a
BLIP-2 Querying Transformer (Q-Former) model according to the specified arguments, defining the model architecture.
Instantiating a configuration with the defaults will yield a similar configuration to that of the BLIP-2
[Salesforce/blip2-opt-2.7b](https://huggingface.co/Salesforce/blip2-opt-2.7b) architecture. Configuration objects
inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the documentation from
[`PretrainedConfig`] for more information.
inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the documentation from
[`PreTrainedConfig`] for more information.

Note that [`Blip2QFormerModel`] is very similar to [`BertLMHeadModel`] with interleaved cross-attention.

@ -215,15 +215,15 @@ class Blip2QFormerConfig(PretrainedConfig):
self.use_qformer_text_input = use_qformer_text_input

class Blip2Config(PretrainedConfig):
class Blip2Config(PreTrainedConfig):
r"""
[`Blip2Config`] is the configuration class to store the configuration of a [`Blip2ForConditionalGeneration`]. It is
used to instantiate a BLIP-2 model according to the specified arguments, defining the vision model, Q-Former model
and language model configs. Instantiating a configuration with the defaults will yield a similar configuration to
that of the BLIP-2 [Salesforce/blip2-opt-2.7b](https://huggingface.co/Salesforce/blip2-opt-2.7b) architecture.

Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PretrainedConfig`] for more information.
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PreTrainedConfig`] for more information.

Args:
vision_config (`dict`, *optional*):
@ -231,7 +231,7 @@ class Blip2Config(PretrainedConfig):
qformer_config (`dict`, *optional*):
Dictionary of configuration options used to initialize [`Blip2QFormerConfig`].
text_config (`dict`, *optional*):
Dictionary of configuration options used to initialize any [`PretrainedConfig`].
Dictionary of configuration options used to initialize any [`PreTrainedConfig`].
num_query_tokens (`int`, *optional*, defaults to 32):
The number of query tokens passed through the Transformer.
image_text_hidden_size (`int`, *optional*, defaults to 256):
@ -262,7 +262,7 @@ class Blip2Config(PretrainedConfig):
>>> # Accessing the model configuration
>>> configuration = model.config

>>> # We can also initialize a Blip2Config from a Blip2VisionConfig, Blip2QFormerConfig and any PretrainedConfig
>>> # We can also initialize a Blip2Config from a Blip2VisionConfig, Blip2QFormerConfig and any PreTrainedConfig

>>> # Initializing BLIP-2 vision, BLIP-2 Q-Former and language model configurations
>>> vision_config = Blip2VisionConfig()
@ -321,7 +321,7 @@ class Blip2Config(PretrainedConfig):
cls,
vision_config: Blip2VisionConfig,
qformer_config: Blip2QFormerConfig,
text_config: Optional[PretrainedConfig] = None,
text_config: Optional[PreTrainedConfig] = None,
**kwargs,
):
r"""
@ -334,7 +334,7 @@ class Blip2Config(PretrainedConfig):
qformer_config (`dict`):
Dictionary of configuration options used to initialize [`Blip2QFormerConfig`].
text_config (`dict`, *optional*):
Dictionary of configuration options used to initialize any [`PretrainedConfig`].
Dictionary of configuration options used to initialize any [`PreTrainedConfig`].

Returns:
[`Blip2Config`]: An instance of a configuration object
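
The hunk above only shows the parameter list of the composite-config helper, so here is a minimal sketch of how the three sub-configs are meant to compose, assuming the classmethod is `Blip2Config.from_vision_qformer_text_configs` and using `OPTConfig` as a stand-in language-model config (both assumptions, not shown in full in this diff):

```python
from transformers import Blip2Config, Blip2QFormerConfig, Blip2VisionConfig, OPTConfig

# Sub-configurations with library defaults; any PreTrainedConfig subclass can act as
# the language-model config (OPTConfig is just one plausible choice).
vision_config = Blip2VisionConfig()
qformer_config = Blip2QFormerConfig()
text_config = OPTConfig()

# Compose the full BLIP-2 configuration from the three parts.
config = Blip2Config.from_vision_qformer_text_configs(vision_config, qformer_config, text_config)
print(config.num_query_tokens)  # 32 by default, per the docstring above
```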
@ -24,7 +24,7 @@ from packaging import version
if TYPE_CHECKING:
from ... import PreTrainedTokenizer

from ...configuration_utils import PretrainedConfig
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfigWithPast, PatchingSpec
from ...utils import is_torch_available, logging

@ -32,15 +32,15 @@ from ...utils import is_torch_available, logging
logger = logging.get_logger(__name__)

class BloomConfig(PretrainedConfig):
class BloomConfig(PreTrainedConfig):
"""
This is the configuration class to store the configuration of a [`BloomModel`]. It is used to instantiate a Bloom
model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
defaults will yield a similar configuration to the Bloom architecture
[bigscience/bloom](https://huggingface.co/bigscience/bloom).

Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PretrainedConfig`] for more information.
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PreTrainedConfig`] for more information.

Args:
@ -147,7 +147,7 @@ class BloomOnnxConfig(OnnxConfigWithPast):

def __init__(
self,
config: PretrainedConfig,
config: PreTrainedConfig,
task: str = "default",
patching_specs: Optional[list[PatchingSpec]] = None,
use_past: bool = False,
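
Since `BloomConfig` is an ordinary `PreTrainedConfig` subclass, a randomly initialized model can be built straight from it. A minimal sketch; the tiny hyperparameters below are arbitrary illustration values (and assume `torch` is installed), not anything prescribed by this diff:

```python
from transformers import BloomConfig, BloomModel

# A deliberately tiny configuration; the values are arbitrary and only for illustration.
config = BloomConfig(vocab_size=1024, hidden_size=64, n_layer=2, n_head=4)

# Instantiating the model from the config creates random weights; no checkpoint is downloaded.
model = BloomModel(config)
print(model.config.n_layer)  # 2
```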
@ -14,14 +14,14 @@
# limitations under the License.
"""Blt model configuration"""

from ...configuration_utils import PretrainedConfig
from ...configuration_utils import PreTrainedConfig
from ...utils import logging

logger = logging.get_logger(__name__)

class BltLocalEncoderConfig(PretrainedConfig):
class BltLocalEncoderConfig(PreTrainedConfig):
"""
Configuration class for the Blt Local Encoder component.
"""
@ -71,7 +71,7 @@ class BltLocalEncoderConfig(PretrainedConfig):
super().__init__(**kwargs, tie_word_embeddings=False)

class BltLocalDecoderConfig(PretrainedConfig):
class BltLocalDecoderConfig(PreTrainedConfig):
"""
Configuration class for the Blt Local Decoder component.
"""
@ -121,7 +121,7 @@ class BltLocalDecoderConfig(PretrainedConfig):
super().__init__(**kwargs, tie_word_embeddings=False)

class BltGlobalTransformerConfig(PretrainedConfig):
class BltGlobalTransformerConfig(PreTrainedConfig):
"""
Configuration class for the Blt Global Transformer component.
"""
@ -163,7 +163,7 @@ class BltGlobalTransformerConfig(PretrainedConfig):
super().__init__(**kwargs, tie_word_embeddings=False)

class BltPatcherConfig(PretrainedConfig):
class BltPatcherConfig(PreTrainedConfig):
r"""
Configuration class for the Blt Patcher/Entropy model component.

@ -239,13 +239,13 @@ class BltPatcherConfig(PretrainedConfig):
super().__init__(**kwargs, tie_word_embeddings=False)

class BltConfig(PretrainedConfig):
class BltConfig(PreTrainedConfig):
r"""
This is the configuration class to store the configuration of a [`BltModel`]. It is used to instantiate a
Blt model according to the specified arguments, defining the model architecture.

Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PretrainedConfig`] for more information.
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PreTrainedConfig`] for more information.

Args:
vocab_size (`int`, *optional*, defaults to 260):
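
Each Blt sub-config above forwards `tie_word_embeddings=False` to the base class. A quick way to confirm that behaviour, assuming these sub-config classes are importable from the top-level `transformers` namespace (they may instead need to be imported from `transformers.models.blt.configuration_blt`):

```python
from transformers import BltLocalDecoderConfig, BltLocalEncoderConfig

# The hunks above show super().__init__(**kwargs, tie_word_embeddings=False) in every
# Blt sub-config, so default instances should report untied embeddings.
encoder_config = BltLocalEncoderConfig()
decoder_config = BltLocalDecoderConfig()
print(encoder_config.tie_word_embeddings, decoder_config.tie_word_embeddings)  # False False
```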
@ -14,21 +14,21 @@
# limitations under the License.
"""BridgeTower model configuration"""

from ...configuration_utils import PretrainedConfig
from ...configuration_utils import PreTrainedConfig
from ...utils import logging

logger = logging.get_logger(__name__)

class BridgeTowerVisionConfig(PretrainedConfig):
class BridgeTowerVisionConfig(PreTrainedConfig):
r"""
This is the configuration class to store the vision configuration of a [`BridgeTowerModel`]. Instantiating a
configuration with the defaults will yield a similar configuration to that of the bridgetower-base
[BridgeTower/bridgetower-base](https://huggingface.co/BridgeTower/bridgetower-base/) architecture.

Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PretrainedConfig`] for more information.
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PreTrainedConfig`] for more information.

Args:
hidden_size (`int`, *optional*, defaults to 768):
@ -94,15 +94,15 @@ class BridgeTowerVisionConfig(PretrainedConfig):
self.remove_last_layer = remove_last_layer

class BridgeTowerTextConfig(PretrainedConfig):
class BridgeTowerTextConfig(PreTrainedConfig):
r"""
This is the configuration class to store the text configuration of a [`BridgeTowerModel`]. The default values here
are copied from RoBERTa. Instantiating a configuration with the defaults will yield a similar configuration to that
of the bridgetower-base [BridegTower/bridgetower-base](https://huggingface.co/BridgeTower/bridgetower-base/)
architecture.

Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PretrainedConfig`] for more information.
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PreTrainedConfig`] for more information.

Args:
vocab_size (`int`, *optional*, defaults to 50265):
@ -202,15 +202,15 @@ class BridgeTowerTextConfig(PretrainedConfig):
self.eos_token_id = eos_token_id

class BridgeTowerConfig(PretrainedConfig):
class BridgeTowerConfig(PreTrainedConfig):
r"""
This is the configuration class to store the configuration of a [`BridgeTowerModel`]. It is used to instantiate a
BridgeTower model according to the specified arguments, defining the model architecture. Instantiating a
configuration with the defaults will yield a similar configuration to that of the bridgetower-base
[BridgeTower/bridgetower-base](https://huggingface.co/BridgeTower/bridgetower-base/) architecture.

Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PretrainedConfig`] for more information.
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PreTrainedConfig`] for more information.

Args:
share_cross_modal_transformer_layers (`bool`, *optional*, defaults to `True`):
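
The three BridgeTower classes above follow the usual composite pattern: a text config, a vision config, and a top-level config that nests them. A minimal sketch, assuming the top-level class accepts `text_config`/`vision_config` dictionaries as its Args section suggests:

```python
from transformers import BridgeTowerConfig, BridgeTowerTextConfig, BridgeTowerVisionConfig

# Sub-configurations with library defaults.
text_config = BridgeTowerTextConfig()
vision_config = BridgeTowerVisionConfig()

# Pass the sub-configs as dictionaries, mirroring the documented `dict` arguments.
config = BridgeTowerConfig(
    text_config=text_config.to_dict(),
    vision_config=vision_config.to_dict(),
)
print(config.share_cross_modal_transformer_layers)  # True by default
```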
@ -14,22 +14,22 @@
# limitations under the License.
"""Bros model configuration"""

from ...configuration_utils import PretrainedConfig
from ...configuration_utils import PreTrainedConfig
from ...utils import logging

logger = logging.get_logger(__name__)

class BrosConfig(PretrainedConfig):
class BrosConfig(PreTrainedConfig):
r"""
This is the configuration class to store the configuration of a [`BrosModel`] or a [`TFBrosModel`]. It is used to
instantiate a Bros model according to the specified arguments, defining the model architecture. Instantiating a
configuration with the defaults will yield a similar configuration to that of the Bros
[jinho8345/bros-base-uncased](https://huggingface.co/jinho8345/bros-base-uncased) architecture.

Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PretrainedConfig`] for more information.
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PreTrainedConfig`] for more information.

Args:
vocab_size (`int`, *optional*, defaults to 30522):

@ -18,7 +18,7 @@
from collections import OrderedDict
from collections.abc import Mapping

from ...configuration_utils import PretrainedConfig
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging

@ -26,15 +26,15 @@ from ...utils import logging
logger = logging.get_logger(__name__)

class CamembertConfig(PretrainedConfig):
class CamembertConfig(PreTrainedConfig):
"""
This is the configuration class to store the configuration of a [`CamembertModel`] or a [`TFCamembertModel`]. It is
used to instantiate a Camembert model according to the specified arguments, defining the model architecture.
Instantiating a configuration with the defaults will yield a similar configuration to that of the Camembert
[almanach/camembert-base](https://huggingface.co/almanach/camembert-base) architecture.

Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PretrainedConfig`] for more information.
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PreTrainedConfig`] for more information.

Args:

@ -14,22 +14,22 @@
# limitations under the License.
"""CANINE model configuration"""

from ...configuration_utils import PretrainedConfig
from ...configuration_utils import PreTrainedConfig
from ...utils import logging

logger = logging.get_logger(__name__)

class CanineConfig(PretrainedConfig):
class CanineConfig(PreTrainedConfig):
r"""
This is the configuration class to store the configuration of a [`CanineModel`]. It is used to instantiate an
CANINE model according to the specified arguments, defining the model architecture. Instantiating a configuration
with the defaults will yield a similar configuration to that of the CANINE
[google/canine-s](https://huggingface.co/google/canine-s) architecture.

Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PretrainedConfig`] for more information.
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PreTrainedConfig`] for more information.

Args:

@ -16,19 +16,19 @@

from typing import Optional

from ...configuration_utils import PretrainedConfig
from ...configuration_utils import PreTrainedConfig
from ...utils import logging

logger = logging.get_logger(__name__)

class ChameleonVQVAEConfig(PretrainedConfig):
class ChameleonVQVAEConfig(PreTrainedConfig):
r"""
This is the configuration class to store the configuration of a [`ChameleonVQModel`]. It is used to instantiate a
`ChameleonVQModel` according to the specified arguments, defining the model architecture.
Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PretrainedConfig`] for more information. Instantiating a
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PreTrainedConfig`] for more information. Instantiating a
configuration with the defaults will yield a similar configuration to the VQModel of the
[meta/chameleon-7B](https://huggingface.co/meta/chameleon-7B).

@ -97,15 +97,15 @@ class ChameleonVQVAEConfig(PretrainedConfig):
self.initializer_range = initializer_range

class ChameleonConfig(PretrainedConfig):
class ChameleonConfig(PreTrainedConfig):
r"""
This is the configuration class to store the configuration of a [`ChameleonModel`]. It is used to instantiate a
chameleon model according to the specified arguments, defining the model architecture. Instantiating a
configuration with the defaults will yield a similar configuration to that of the
[meta/chameleon-7B](https://huggingface.co/meta/chameleon-7B).

Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PretrainedConfig`] for more information.
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PreTrainedConfig`] for more information.

Args:
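
The Chameleon block renames both the VQ-VAE sub-config and the top-level config. A sketch of how the two are assumed to nest; the `vq_config` attribute name is an assumption and does not appear in the hunks above:

```python
from transformers import ChameleonConfig, ChameleonVQVAEConfig

# Default top-level configuration; the image-tokenizer settings are assumed to live
# on a nested `vq_config` attribute holding a ChameleonVQVAEConfig.
config = ChameleonConfig()
print(type(config.vq_config).__name__)  # expected: ChameleonVQVAEConfig

# The nested config can also be customized explicitly and passed back in as a dict.
custom = ChameleonConfig(vq_config=ChameleonVQVAEConfig().to_dict())
```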
@ -22,7 +22,7 @@ from typing import TYPE_CHECKING, Any
if TYPE_CHECKING:
from ...processing_utils import ProcessorMixin

from ...configuration_utils import PretrainedConfig
from ...configuration_utils import PreTrainedConfig
from ...onnx import OnnxConfig
from ...utils import logging

@ -30,7 +30,7 @@ from ...utils import logging
logger = logging.get_logger(__name__)

class ChineseCLIPTextConfig(PretrainedConfig):
class ChineseCLIPTextConfig(PreTrainedConfig):
r"""
This is the configuration class to store the configuration of a [`ChineseCLIPModel`]. It is used to instantiate a
Chinese CLIP model according to the specified arguments, defining the model architecture. Instantiating a
@ -38,8 +38,8 @@ class ChineseCLIPTextConfig(PretrainedConfig):
[OFA-Sys/chinese-clip-vit-base-patch16](https://huggingface.co/OFA-Sys/chinese-clip-vit-base-patch16) architecture.

Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PretrainedConfig`] for more information.
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PreTrainedConfig`] for more information.

Args:
@ -142,15 +142,15 @@ class ChineseCLIPTextConfig(PretrainedConfig):
self.use_cache = use_cache

class ChineseCLIPVisionConfig(PretrainedConfig):
class ChineseCLIPVisionConfig(PreTrainedConfig):
r"""
This is the configuration class to store the configuration of a [`ChineseCLIPModel`]. It is used to instantiate an
ChineseCLIP model according to the specified arguments, defining the model architecture. Instantiating a
configuration with the defaults will yield a similar configuration to that of the ChineseCLIP
[OFA-Sys/chinese-clip-vit-base-patch16](https://huggingface.co/OFA-Sys/chinese-clip-vit-base-patch16) architecture.

Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PretrainedConfig`] for more information.
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PreTrainedConfig`] for more information.

Args:
@ -233,7 +233,7 @@ class ChineseCLIPVisionConfig(PretrainedConfig):
self.hidden_act = hidden_act

class ChineseCLIPConfig(PretrainedConfig):
class ChineseCLIPConfig(PreTrainedConfig):
r"""
[`ChineseCLIPConfig`] is the configuration class to store the configuration of a [`ChineseCLIPModel`]. It is used
to instantiate Chinese-CLIP model according to the specified arguments, defining the text model and vision model
@ -241,8 +241,8 @@ class ChineseCLIPConfig(PretrainedConfig):
Chinese-CLIP [OFA-Sys/chinese-clip-vit-base-patch16](https://huggingface.co/OFA-Sys/chinese-clip-vit-base-patch16)
architecture.

Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PretrainedConfig`] for more information.
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PreTrainedConfig`] for more information.

Args:
text_config (`dict`, *optional*):
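
`ChineseCLIPConfig` is documented above as combining a text config and a vision config. A minimal sketch of composing it from defaults, passing the sub-configs as dictionaries because that is how the arguments are documented:

```python
from transformers import ChineseCLIPConfig, ChineseCLIPTextConfig, ChineseCLIPVisionConfig

# Sub-configurations with library defaults.
text_config = ChineseCLIPTextConfig()
vision_config = ChineseCLIPVisionConfig()

# Compose the dual-encoder configuration.
config = ChineseCLIPConfig(
    text_config=text_config.to_dict(),
    vision_config=vision_config.to_dict(),
)
print(config.text_config.vocab_size)
```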
@ -14,22 +14,22 @@
# limitations under the License.
"""CLAP model configuration"""

from ...configuration_utils import PretrainedConfig
from ...configuration_utils import PreTrainedConfig
from ...utils import logging

logger = logging.get_logger(__name__)

class ClapTextConfig(PretrainedConfig):
class ClapTextConfig(PreTrainedConfig):
r"""
This is the configuration class to store the configuration of a [`ClapTextModel`]. It is used to instantiate a CLAP
model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
defaults will yield a similar configuration to that of the CLAP
[calp-hsat-fused](https://huggingface.co/laion/clap-hsat-fused) architecture.

Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PretrainedConfig`] for more information.
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PreTrainedConfig`] for more information.

Args:
@ -136,15 +136,15 @@ class ClapTextConfig(PretrainedConfig):
self.projection_dim = projection_dim

class ClapAudioConfig(PretrainedConfig):
class ClapAudioConfig(PreTrainedConfig):
r"""
This is the configuration class to store the configuration of a [`ClapAudioModel`]. It is used to instantiate a
CLAP audio encoder according to the specified arguments, defining the model architecture. Instantiating a
configuration with the defaults will yield a similar configuration to that of the audio encoder of the CLAP
[laion/clap-htsat-fused](https://huggingface.co/laion/clap-htsat-fused) architecture.

Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PretrainedConfig`] for more information.
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PreTrainedConfig`] for more information.

Args:
window_size (`int`, *optional*, defaults to 8):
@ -289,15 +289,15 @@ class ClapAudioConfig(PretrainedConfig):
self.projection_hidden_act = projection_hidden_act

class ClapConfig(PretrainedConfig):
class ClapConfig(PreTrainedConfig):
r"""
[`ClapConfig`] is the configuration class to store the configuration of a [`ClapModel`]. It is used to instantiate
a CLAP model according to the specified arguments, defining the text model and audio model configs. Instantiating a
configuration with the defaults will yield a similar configuration to that of the CLAP
[laion/clap-htsat-fused](https://huggingface.co/laion/clap-htsat-fused) architecture.

Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PretrainedConfig`] for more information.
Configuration objects inherit from [`PreTrainedConfig`] and can be used to control the model outputs. Read the
documentation from [`PreTrainedConfig`] for more information.

Args:
text_config (`dict`, *optional*):
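
The same composite pattern applies to CLAP, which pairs a text encoder with an audio encoder. A minimal sketch; the `audio_config` argument name is inferred from the `ClapAudioConfig` class above rather than shown directly in this diff:

```python
from transformers import ClapAudioConfig, ClapConfig, ClapTextConfig

# Text and audio sub-configurations with library defaults.
text_config = ClapTextConfig()
audio_config = ClapAudioConfig()

# Compose the full CLAP configuration from the two parts.
config = ClapConfig(
    text_config=text_config.to_dict(),
    audio_config=audio_config.to_dict(),
)
print(config.projection_dim)
```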
Some files were not shown because too many files have changed in this diff.