This model was released on 2020-04-10 and added to Hugging Face Transformers on 2020-11-16 and contributed by beltagy.

# Longformer

Longformer: The Long-Document Transformer introduces an attention mechanism that scales linearly with sequence length, enabling the processing of very long documents. This mechanism combines local windowed attention with task-specific global attention, replacing standard self-attention. Longformer achieves state-of-the-art results in character-level language modeling on text8 and enwik8. When pretrained and fine-tuned, it outperforms RoBERTa on long document tasks, setting new benchmarks on WikiHop and TriviaQA.
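The combination of a sliding local window with a handful of global tokens can be pictured as a banded boolean attention mask. The sketch below is purely illustrative (it materializes the full n×n matrix, which the real implementation deliberately avoids; the function name and toy sizes are made up for this example):

```python
import torch

def longformer_attention_mask(seq_len, window, global_idx=()):
    # Toy sketch of Longformer's pattern: each token attends to a local
    # window of neighbors; positions in global_idx additionally attend
    # (and are attended to) globally. Not the library's implementation.
    mask = torch.zeros(seq_len, seq_len, dtype=torch.bool)
    half = window // 2
    for i in range(seq_len):
        lo, hi = max(0, i - half), min(seq_len, i + half + 1)
        mask[i, lo:hi] = True  # local window around token i
    for g in global_idx:
        mask[g, :] = True  # global token attends everywhere
        mask[:, g] = True  # every token attends to the global token
    return mask

# 8 tokens, window of 2 neighbors, first token (e.g. <s>) global
mask = longformer_attention_mask(8, window=2, global_idx=[0])
print(mask.sum().item())  # number of attended pairs, far below 8 * 8 = 64
```

Because each row holds roughly `window` entries plus the global columns, the cost grows linearly with sequence length instead of quadratically.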

```py
import torch
from transformers import pipeline

pipeline = pipeline(task="fill-mask", model="allenai/longformer-base-4096", dtype="auto")
pipeline("Plants are among the most <mask> and essential life forms on Earth, possessing a unique ability to produce their own food through a process known as photosynthesis. This complex biochemical process is fundamental not only to plant life but to virtually all life on the planet. Through photosynthesis, plants capture energy from sunlight using a green pigment called chlorophyll, which is located in specialized cell structures called chloroplasts. In the presence of light, plants absorb carbon dioxide from the atmosphere through small pores in their leaves called stomata, and take in water from the soil through their root systems. These ingredients are then transformed into glucose, a type of sugar that serves as a source of chemical energy, and oxygen, which is released as a byproduct into the atmosphere. The glucose produced during photosynthesis is not just used immediately; plants also store it as starch or convert it into other organic compounds like cellulose, which is essential for building their cellular structure.")
```

```py
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model = AutoModelForMaskedLM.from_pretrained("allenai/longformer-base-4096", dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("allenai/longformer-base-4096")

text = """
Plants are among the most <mask> and essential life forms on Earth, possessing a unique ability to produce their own food through a process known as photosynthesis. This complex biochemical process is fundamental not only to plant life but to virtually all life on the planet.
Through photosynthesis, plants capture energy from sunlight using a green pigment called chlorophyll, which is located in specialized cell structures called chloroplasts. In the presence of light, plants absorb carbon dioxide from the atmosphere through small pores in their leaves called stomata, and take in water from the soil through their root systems.
These ingredients are then transformed into glucose, a type of sugar that serves as a source of chemical energy, and oxygen, which is released as a byproduct into the atmosphere. The glucose produced during photosynthesis is not just used immediately; plants also store it as starch or convert it into other organic compounds like cellulose, which is essential for building their cellular structure.
"""
input_ids = tokenizer([text], return_tensors="pt")["input_ids"]
logits = model(input_ids).logits
print(tokenizer.decode(logits[0, (input_ids[0] == tokenizer.mask_token_id).nonzero().item()].argmax()))
```

## Usage tips

- Longformer is based on RoBERTa and doesn't have `token_type_ids`. You don't need to indicate which token belongs to which segment. Just separate segments with the separation token `</s>` or `tokenizer.sep_token`.
- Set which tokens attend locally and which attend globally with the `global_attention_mask` at inference. A value of `0` means a token attends locally. A value of `1` means a token attends globally.
- [LongformerForMaskedLM] is trained like [RobertaForMaskedLM] and should be used similarly.
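As a minimal sketch of the `global_attention_mask` tip above, the mask is built with the same shape as `input_ids`, marking the first token (`<s>`, which Longformer uses for classification) as global and leaving the rest local (the token ids here are toy values):

```python
import torch

input_ids = torch.tensor([[0, 9291, 16, 10, 2]])  # toy token ids: <s> ... </s>
global_attention_mask = torch.zeros_like(input_ids)  # 0 = local attention
global_attention_mask[:, 0] = 1                      # 1 = global attention
print(global_attention_mask.tolist())  # [[1, 0, 0, 0, 0]]
```

The mask is then passed alongside the inputs in the forward call, e.g. `model(input_ids, global_attention_mask=global_attention_mask)`. For question answering, question tokens would typically also be set to `1`.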

## LongformerConfig

[[autodoc]] LongformerConfig
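One setting worth noting in the config is `attention_window`, the size of each token's local window: it accepts a single int shared by all layers or a list with one value per layer. A small sketch with illustrative values (no checkpoint is downloaded):

```python
from transformers import LongformerConfig

# One window size per hidden layer; 256 here is illustrative, not the
# default used by allenai/longformer-base-4096 (which is 512).
config = LongformerConfig(attention_window=[256] * 12, num_hidden_layers=12)
print(config.attention_window[0])  # 256
```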

## LongformerTokenizer

[[autodoc]] LongformerTokenizer

## LongformerTokenizerFast

[[autodoc]] LongformerTokenizerFast

## Longformer specific outputs

[[autodoc]] models.longformer.modeling_longformer.LongformerBaseModelOutput

[[autodoc]] models.longformer.modeling_longformer.LongformerBaseModelOutputWithPooling

[[autodoc]] models.longformer.modeling_longformer.LongformerMaskedLMOutput

[[autodoc]] models.longformer.modeling_longformer.LongformerQuestionAnsweringModelOutput

[[autodoc]] models.longformer.modeling_longformer.LongformerSequenceClassifierOutput

[[autodoc]] models.longformer.modeling_longformer.LongformerMultipleChoiceModelOutput

[[autodoc]] models.longformer.modeling_longformer.LongformerTokenClassifierOutput

[[autodoc]] models.longformer.modeling_tf_longformer.TFLongformerBaseModelOutputWithPooling

[[autodoc]] models.longformer.modeling_tf_longformer.TFLongformerQuestionAnsweringModelOutput

[[autodoc]] models.longformer.modeling_tf_longformer.TFLongformerMultipleChoiceModelOutput

## LongformerModel

[[autodoc]] LongformerModel
    - forward

## LongformerForMaskedLM

[[autodoc]] LongformerForMaskedLM
    - forward

## LongformerForSequenceClassification

[[autodoc]] LongformerForSequenceClassification
    - forward

## LongformerForMultipleChoice

[[autodoc]] LongformerForMultipleChoice
    - forward

## LongformerForTokenClassification

[[autodoc]] LongformerForTokenClassification
    - forward

## LongformerForQuestionAnswering

[[autodoc]] LongformerForQuestionAnswering
    - forward