mirror of
https://github.com/huggingface/transformers.git
synced 2025-10-20 17:13:56 +08:00
docs: update LightGlue docs (#39407)
* docs: update LightGlue docs * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
This commit is contained in:
@ -10,37 +10,31 @@ specific language governing permissions and limitations under the License.
|
||||
⚠️ Note that this file is in Markdown but contain specific syntax for our doc-builder (similar to MDX) that may not be
|
||||
rendered properly in your Markdown viewer.
|
||||
|
||||
|
||||
-->
|
||||
|
||||
<div style="float: right;">
|
||||
<div class="flex flex-wrap space-x-1">
|
||||
<img alt="PyTorch" src="https://img.shields.io/badge/PyTorch-DE3412?style=flat&logo=pytorch&logoColor=white" >
|
||||
</div>
|
||||
</div>
|
||||
|
||||
# LightGlue
|
||||
|
||||
## Overview
|
||||
[LightGlue](https://arxiv.org/abs/2306.13643) is a deep neural network that learns to match local features across images. It revisits multiple design decisions of SuperGlue and derives simple but effective improvements. Cumulatively, these improvements make LightGlue more efficient - in terms of both memory and computation, more accurate, and much easier to train. Similar to [SuperGlue](https://huggingface.co/magic-leap-community/superglue_outdoor), this model consists of matching two sets of local features extracted from two images, with the goal of being faster than SuperGlue. Paired with the [SuperPoint model](https://huggingface.co/magic-leap-community/superpoint), it can be used to match two images and estimate the pose between them.
|
||||
|
||||
The LightGlue model was proposed in [LightGlue: Local Feature Matching at Light Speed](https://arxiv.org/abs/2306.13643)
|
||||
by Philipp Lindenberger, Paul-Edouard Sarlin and Marc Pollefeys.
|
||||
You can find all the original LightGlue checkpoints under the [ETH-CVG](https://huggingface.co/ETH-CVG) organization.
|
||||
|
||||
Similar to [SuperGlue](https://huggingface.co/magic-leap-community/superglue_outdoor), this model consists of matching
|
||||
two sets of local features extracted from two images, its goal is to be faster than SuperGlue. Paired with the
|
||||
[SuperPoint model](https://huggingface.co/magic-leap-community/superpoint), it can be used to match two images and
|
||||
estimate the pose between them. This model is useful for tasks such as image matching, homography estimation, etc.
|
||||
> [!TIP]
|
||||
> This model was contributed by [stevenbucaille](https://huggingface.co/stevenbucaille).
|
||||
>
|
||||
> Click on the LightGlue models in the right sidebar for more examples of how to apply LightGlue to different computer vision tasks.
|
||||
|
||||
The abstract from the paper is the following:
|
||||
The example below demonstrates how to match keypoints between two images with the [`AutoModel`] class.
|
||||
|
||||
*We introduce LightGlue, a deep neural network that learns to match local features across images. We revisit multiple
|
||||
design decisions of SuperGlue, the state of the art in sparse matching, and derive simple but effective improvements.
|
||||
Cumulatively, they make LightGlue more efficient - in terms of both memory and computation, more accurate, and much
|
||||
easier to train. One key property is that LightGlue is adaptive to the difficulty of the problem: the inference is much
|
||||
faster on image pairs that are intuitively easy to match, for example because of a larger visual overlap or limited
|
||||
appearance change. This opens up exciting prospects for deploying deep matchers in latency-sensitive applications like
|
||||
3D reconstruction. The code and trained models are publicly available at this [https URL](https://github.com/cvg/LightGlue)*
|
||||
<hfoptions id="usage">
|
||||
<hfoption id="AutoModel">
|
||||
|
||||
## How to use
|
||||
|
||||
Here is a quick example of using the model. Since this model is an image matching model, it requires pairs of images to be matched.
|
||||
The raw outputs contain the list of keypoints detected by the keypoint detector as well as the list of matches with their corresponding
|
||||
matching scores.
|
||||
```python
|
||||
```py
|
||||
from transformers import AutoImageProcessor, AutoModel
|
||||
import torch
|
||||
from PIL import Image
|
||||
@ -59,31 +53,70 @@ model = AutoModel.from_pretrained("ETH-CVG/lightglue_superpoint")
|
||||
inputs = processor(images, return_tensors="pt")
|
||||
with torch.no_grad():
|
||||
outputs = model(**inputs)
|
||||
```
|
||||
|
||||
You can use the `post_process_keypoint_matching` method from the `LightGlueImageProcessor` to get the keypoints and matches in a readable format:
|
||||
```python
|
||||
# Post-process to get keypoints and matches
|
||||
image_sizes = [[(image.height, image.width) for image in images]]
|
||||
outputs = processor.post_process_keypoint_matching(outputs, image_sizes, threshold=0.2)
|
||||
for i, output in enumerate(outputs):
|
||||
print("For the image pair", i)
|
||||
for keypoint0, keypoint1, matching_score in zip(
|
||||
output["keypoints0"], output["keypoints1"], output["matching_scores"]
|
||||
):
|
||||
print(
|
||||
f"Keypoint at coordinate {keypoint0.numpy()} in the first image matches with keypoint at coordinate {keypoint1.numpy()} in the second image with a score of {matching_score}."
|
||||
)
|
||||
processed_outputs = processor.post_process_keypoint_matching(outputs, image_sizes, threshold=0.2)
|
||||
```
|
||||
|
||||
You can visualize the matches between the images by providing the original images as well as the outputs to this method:
|
||||
```python
|
||||
processor.plot_keypoint_matching(images, outputs)
|
||||
```
|
||||
</hfoption>
|
||||
</hfoptions>
|
||||
|
||||

|
||||
## Notes
|
||||
|
||||
This model was contributed by [stevenbucaille](https://huggingface.co/stevenbucaille).
|
||||
The original code can be found [here](https://github.com/cvg/LightGlue).
|
||||
- LightGlue is adaptive to the task difficulty. Inference is much faster on image pairs that are intuitively easy to match, for example, because of a larger visual overlap or limited appearance change.
|
||||
|
||||
```py
|
||||
from transformers import AutoImageProcessor, AutoModel
|
||||
import torch
|
||||
from PIL import Image
|
||||
import requests
|
||||
|
||||
processor = AutoImageProcessor.from_pretrained("ETH-CVG/lightglue_superpoint")
|
||||
model = AutoModel.from_pretrained("ETH-CVG/lightglue_superpoint")
|
||||
|
||||
# LightGlue requires pairs of images
|
||||
images = [image1, image2]
|
||||
inputs = processor(images, return_tensors="pt")
|
||||
outputs = model(**inputs)
|
||||
|
||||
# Extract matching information
|
||||
keypoints0 = outputs.keypoints0 # Keypoints in first image
|
||||
keypoints1 = outputs.keypoints1 # Keypoints in second image
|
||||
matches = outputs.matches # Matching indices
|
||||
matching_scores = outputs.matching_scores # Confidence scores
|
||||
```
|
||||
|
||||
- The model outputs matching indices, keypoints, and confidence scores for each match, similar to SuperGlue but with improved efficiency.
|
||||
- For better visualization and analysis, use the [`LightGlueImageProcessor.post_process_keypoint_matching`] method to get matches in a more readable format.
|
||||
|
||||
```py
|
||||
# Process outputs for visualization
|
||||
image_sizes = [[(image.height, image.width) for image in images]]
|
||||
processed_outputs = processor.post_process_keypoint_matching(outputs, image_sizes, threshold=0.2)
|
||||
|
||||
for i, output in enumerate(processed_outputs):
|
||||
print(f"For the image pair {i}")
|
||||
for keypoint0, keypoint1, matching_score in zip(
|
||||
output["keypoints0"], output["keypoints1"], output["matching_scores"]
|
||||
):
|
||||
print(f"Keypoint at {keypoint0.numpy()} matches with keypoint at {keypoint1.numpy()} with score {matching_score}")
|
||||
```
|
||||
|
||||
- Visualize the matches between the images using the built-in plotting functionality.
|
||||
|
||||
```py
|
||||
# Easy visualization using the built-in plotting method
|
||||
processor.plot_keypoint_matching(images, processed_outputs)
|
||||
```
|
||||
|
||||
<div class="flex justify-center">
|
||||
<img src="https://cdn-uploads.huggingface.co/production/uploads/632885ba1558dac67c440aa8/duPp09ty8NRZlMZS18ccP.png">
|
||||
</div>
|
||||
|
||||
## Resources
|
||||
|
||||
- Refer to the [original LightGlue repository](https://github.com/cvg/LightGlue) for more examples and implementation details.
|
||||
|
||||
## LightGlueConfig
|
||||
|
||||
@ -97,8 +130,13 @@ The original code can be found [here](https://github.com/cvg/LightGlue).
|
||||
- post_process_keypoint_matching
|
||||
- plot_keypoint_matching
|
||||
|
||||
<frameworkcontent>
|
||||
<pt>
|
||||
## LightGlueForKeypointMatching
|
||||
|
||||
[[autodoc]] LightGlueForKeypointMatching
|
||||
|
||||
- forward
|
||||
|
||||
</pt>
|
||||
</frameworkcontent>
|
||||
|
Reference in New Issue
Block a user