BigBird
This model was released on 2020-07-28 and added to Hugging Face Transformers on 2021-03-30. It was contributed by vasudevgupta.
BigBird, introduced in the paper "BigBird: Transformers for Longer Sequences", uses a sparse-attention mechanism that reduces the quadratic dependency on sequence length to linear, so it can handle much longer sequences than models like BERT. It combines sliding-window (local), global, and random attention to approximate full attention efficiently. This lets it process sequences up to 8 times longer on similar hardware, improving performance on long-document NLP tasks such as question answering and summarization, and it also enables novel applications in genomics.
import torch
from transformers import pipeline
pipeline = pipeline(task="fill-mask", model="google/bigbird-roberta-base", dtype="auto")
pipeline("Plants create [MASK] through a process known as photosynthesis.")
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model = AutoModelForMaskedLM.from_pretrained("google/bigbird-roberta-base", dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("google/bigbird-roberta-base")

inputs = tokenizer("Plants create [MASK] through a process known as photosynthesis.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Locate the [MASK] position and decode the highest-scoring token at that position
mask_position = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_word = tokenizer.decode(outputs.logits[0, mask_position].argmax(dim=-1))
print(f"Predicted word: {predicted_word}")
Usage tips
- Pad inputs on the right. BigBird uses absolute position embeddings.
- BigBird supports `original_full` and `block_sparse` attention. Use `original_full` for sequences under 1024 tokens, since sparse attention patterns don't help much with smaller inputs (a configuration sketch follows this list).
- The current implementation uses a 3-block window size and 2 global blocks. It only supports the ITC implementation and doesn't support `num_random_blocks=0`.
- Sequence length must be divisible by the block size.
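These options can be set when loading a checkpoint, since extra keyword arguments to from_pretrained override the corresponding BigBirdConfig fields. The snippet below is a rough sketch of both attention modes; the block_size and num_random_blocks values shown are simply the defaults made explicit.

from transformers import BigBirdForMaskedLM

# Short inputs (< 1024 tokens): full attention avoids sparse-pattern overhead
model_full = BigBirdForMaskedLM.from_pretrained(
    "google/bigbird-roberta-base", attention_type="original_full"
)

# Long inputs: sparse attention with an explicit block size and random blocks
model_sparse = BigBirdForMaskedLM.from_pretrained(
    "google/bigbird-roberta-base",
    attention_type="block_sparse",
    block_size=64,
    num_random_blocks=3,  # must be greater than 0 in the current implementation
)

print(model_full.config.attention_type, model_sparse.config.attention_type)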
BigBirdConfig
autodoc BigBirdConfig
BigBirdTokenizer
autodoc BigBirdTokenizer - build_inputs_with_special_tokens - get_special_tokens_mask - create_token_type_ids_from_sequences - save_vocabulary
BigBirdTokenizerFast
autodoc BigBirdTokenizerFast
BigBird specific outputs
autodoc models.big_bird.modeling_big_bird.BigBirdForPreTrainingOutput
BigBirdModel
autodoc BigBirdModel - forward
BigBirdForPreTraining
autodoc BigBirdForPreTraining - forward
BigBirdForCausalLM
autodoc BigBirdForCausalLM - forward
BigBirdForMaskedLM
autodoc BigBirdForMaskedLM - forward
BigBirdForSequenceClassification
autodoc BigBirdForSequenceClassification - forward
BigBirdForMultipleChoice
autodoc BigBirdForMultipleChoice - forward
BigBirdForTokenClassification
autodoc BigBirdForTokenClassification - forward
BigBirdForQuestionAnswering
autodoc BigBirdForQuestionAnswering - forward