2nd try

Empty commit
[doc] Try a few ≠ ways of linking to Papers, users, and org profiles
2025-10-20 17:13:56 +08:00 · 2023-04-06 12:33:38 -04:00 · 2023-04-06 14:56:17 +02:00 · 2023-04-06 12:51:16 +02:00
4 changed files with 14 additions and 5 deletions
--- a/docs/source/en/model_doc/distilbert.mdx
+++ b/docs/source/en/model_doc/distilbert.mdx
@@ -19,13 +19,16 @@ specific language governing permissions and limitations under the License.
 <a href="https://huggingface.co/spaces/docs-demos/distilbert-base-uncased">
 <img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
 </a>
+<a href="https://huggingface.co/papers/1910.01108">
+<img alt="Paper page" src="https://img.shields.io/badge/Paper%20page-1910.01108-green">
+</a>
 </div>

 ## Overview

 The DistilBERT model was proposed in the blog post [Smaller, faster, cheaper, lighter: Introducing DistilBERT, a
 distilled version of BERT](https://medium.com/huggingface/distilbert-8cf3380435b5), and the paper [DistilBERT, a
-distilled version of BERT: smaller, faster, cheaper and lighter](https://arxiv.org/abs/1910.01108). DistilBERT is a
+distilled version of BERT: smaller, faster, cheaper and lighter](https://arxiv.org/papers/1910.01108). DistilBERT is a
 small, fast, cheap and light Transformer model trained by distilling BERT base. It has 40% less parameters than
 *bert-base-uncased*, runs 60% faster while preserving over 95% of BERT's performances as measured on the GLUE language
 understanding benchmark.
--- a/docs/source/en/model_doc/gpt2.mdx
+++ b/docs/source/en/model_doc/gpt2.mdx
@@ -24,7 +24,7 @@ specific language governing permissions and limitations under the License.
 ## Overview

 OpenAI GPT-2 model was proposed in [Language Models are Unsupervised Multitask Learners](https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf) by Alec
-Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei and Ilya Sutskever. It's a causal (unidirectional)
+Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei and Ilya Sutskever from [OpenAI](https://huggingface.co/openai). It's a causal (unidirectional)
 transformer pretrained using language modeling on a very large corpus of ~40 GB of text data.

 The abstract from the paper is the following:
--- a/docs/source/en/model_doc/roberta.mdx
+++ b/docs/source/en/model_doc/roberta.mdx
@@ -19,11 +19,14 @@ specific language governing permissions and limitations under the License.
 <a href="https://huggingface.co/spaces/docs-demos/roberta-base">
 <img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
 </a>
+<a href="https://huggingface.co/papers/1907.11692">
+<img alt="Paper page" src="https://img.shields.io/badge/Paper%20page-1907.11692-green">
+</a>
 </div>

 ## Overview

-The RoBERTa model was proposed in [RoBERTa: A Robustly Optimized BERT Pretraining Approach](https://arxiv.org/abs/1907.11692) by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer
+The RoBERTa model was proposed in [RoBERTa: A Robustly Optimized BERT Pretraining Approach](https://arxiv.org/abs/1907.11692) by Yinhan Liu, [Myle Ott](https://huggingface.co/myleott), Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer
 Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov. It is based on Google's BERT model released in 2018.

 It builds on BERT and modifies key hyperparameters, removing the next-sentence pretraining objective and training with
--- a/docs/source/en/model_doc/t5.mdx
+++ b/docs/source/en/model_doc/t5.mdx
@@ -19,12 +19,15 @@ specific language governing permissions and limitations under the License.
 <a href="https://huggingface.co/spaces/docs-demos/t5-base">
 <img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
 </a>
+<a href="https://huggingface.co/papers/1910.10683">
+<img alt="Paper page" src="https://img.shields.io/badge/Paper%20page-1910.10683-green">
+</a>
 </div>

 ## Overview

-The T5 model was presented in [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf) by Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang,
-Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu.
+The T5 model was presented in [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf) by [Colin Raffel](https://huggingface.co/craffel), Noam Shazeer, [Adam Roberts](https://huggingface.co/adarob), Katherine Lee, Sharan Narang,
+Michael Matena, Yanqi Zhou, Wei Li, [Peter J. Liu](https://huggingface.co/peterjliu).

 The abstract from the paper is the following:
Author	SHA1	Message	Date
Lysandre	a0ccf540c4	2nd try	2023-04-06 12:33:38 -04:00
Julien Chaumond	92897be543	Empty commit	2023-04-06 14:56:17 +02:00
Julien Chaumond	4a6c4508c8	[doc] Try a few ≠ ways of linking to Papers, users, and org profiles	2023-04-06 12:51:16 +02:00
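
The three "Paper page" badges added in this change (distilbert, roberta, t5) share one pattern: a link to `https://huggingface.co/papers/<arXiv id>` wrapping a shields.io badge whose label is the same arXiv id. A minimal sketch of generating that markup is below. The helper name `paper_badge` is hypothetical and not part of the repository; it only reproduces the strings visible in the diff.

```python
from urllib.parse import quote


def paper_badge(arxiv_id: str) -> str:
    """Return the badge markup for a paper page, mirroring the lines added in the diff.

    `paper_badge` is an illustrative helper, not an existing function in the docs tooling.
    """
    # Percent-encode the badge label so the space becomes %20, as in the diff.
    label = quote("Paper page")
    return (
        f'<a href="https://huggingface.co/papers/{arxiv_id}">\n'
        f'<img alt="Paper page" src="https://img.shields.io/badge/{label}-{arxiv_id}-green">\n'
        "</a>"
    )


if __name__ == "__main__":
    # arXiv ids taken from the diff: DistilBERT, RoBERTa, T5.
    for arxiv_id in ("1910.01108", "1907.11692", "1910.10683"):
        print(paper_badge(arxiv_id))
```

Generating the snippet from the arXiv id keeps the link target and the badge label in sync, which is easy to get wrong when the three lines are copied by hand between model pages.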