This model was released on 2020-10-20, added to Hugging Face Transformers on 2023-06-20, and contributed by stefan-it.
Warning
This model is in maintenance mode only, so we do not accept any new PRs changing its code.
If you run into any issues running this model, please reinstall the last version that supported this model: v4.30.0. You can do so by running the following command: pip install -U transformers==4.30.0.
BORT
BORT extracts an optimal subset of architectural parameters from BERT, reducing its size to 5.5% of the effective size of BERT-large and 16% of its net size. BORT can be pretrained in 288 GPU hours, which is 1.2% of the time required for RoBERTa-large and 33% of the time required for BERT-large. It runs 7.9x faster than BERT-large on a CPU and outperforms other compressed variants as well as some non-compressed variants, with absolute improvements of 0.3% to 31% over BERT-large on multiple NLU benchmarks.
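The example below predicts the [MASK] token with the fill-mask Pipeline.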
from transformers import pipeline
# Load BORT as a standard fill-mask pipeline; torch_dtype="auto" uses the dtype saved in the checkpoint
pipeline = pipeline(task="fill-mask", model="amazon/bort", torch_dtype="auto")
pipeline("Plants create [MASK] through a process known as photosynthesis.")
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer
model = AutoModelForMaskedLM.from_pretrained("amazon/bort", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("amazon/bort")
inputs = tokenizer("Plants create [MASK] through a process known as photosynthesis.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
# Locate the [MASK] token and decode the highest-scoring prediction for that position
mask_token_id = tokenizer.mask_token_id
mask_position = (inputs.input_ids == mask_token_id).nonzero(as_tuple=True)[1]
predicted_word = tokenizer.decode(outputs.logits[0, mask_position].argmax(dim=-1))
print(f"Predicted word: {predicted_word}")
Usage tips
- BORT uses the RoBERTa tokenizer instead of the BERT tokenizer. Check RoBERTa's documentation for the API reference and usage examples; a minimal loading sketch follows below.
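As a minimal sketch of what this tip means in practice, the tokenizer can be loaded through the RoBERTa class directly, assuming the amazon/bort repository ships RoBERTa-style vocabulary files (AutoTokenizer resolves to the same tokenizer).
from transformers import RobertaTokenizer
# Assumes amazon/bort hosts RoBERTa-style vocab files, as the usage tip above states
tokenizer = RobertaTokenizer.from_pretrained("amazon/bort")
print(tokenizer.tokenize("Plants create energy through photosynthesis."))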