mirror of https://github.com/huggingface/peft.git synced 2025-10-20 15:33:48 +08:00

Eric Buehler 4c3a76fa68 FIX DOC Update X-LoRA docs, some bugfixes (#2002 )

Bugs with dtype and loading of LoRA adapters.

2024-08-15 15:29:32 +02:00


X-LoRA examples

`xlora_inference_mistralrs.py`

Perform inference of an X-LoRA model using the inference engine mistral.rs.

Mistral.rs supports many base models besides Mistral and can load models directly from saved LoRA checkpoints. See the adapter model docs and the model support matrix for details.

Mistral.rs features X-LoRA support and incorporates techniques such as a dual-KV cache, continuous batching, Paged Attention, and optional non-granular scalings, which together can substantially improve throughput.
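To make the idea of non-granular scalings more concrete, here is a minimal pure-Python sketch. It is not mistral.rs or PEFT code: `mix_adapters`, `ScalingSchedule`, `toy_scalings`, and the `freeze_after` parameter are hypothetical names chosen for illustration. It assumes that X-LoRA combines adapter outputs as a weighted sum using per-token scalings, and that the non-granular option caches those scalings after a set number of tokens instead of recomputing them for every token.

```python
# Illustrative sketch only; none of these names come from mistral.rs or PEFT.
from typing import Callable, List, Optional, Sequence

Vector = List[float]


def mix_adapters(adapter_outputs: Sequence[Vector], scalings: Sequence[float]) -> Vector:
    """Return the weighted sum of adapter outputs, one scaling per adapter."""
    if len(adapter_outputs) != len(scalings):
        raise ValueError("need exactly one scaling per adapter")
    dim = len(adapter_outputs[0])
    mixed = [0.0] * dim
    for output, scale in zip(adapter_outputs, scalings):
        for j in range(dim):
            mixed[j] += scale * output[j]
    return mixed


class ScalingSchedule:
    """Recompute scalings per token, or reuse cached ones after `freeze_after` tokens."""

    def __init__(
        self,
        compute_scalings: Callable[[Vector], List[float]],
        freeze_after: Optional[int] = None,  # None means fully granular (assumed semantics)
    ) -> None:
        self.compute_scalings = compute_scalings
        self.freeze_after = freeze_after
        self.cached: Optional[List[float]] = None
        self.calls = 0  # how many times the scaling computation actually ran

    def scalings_for(self, token_index: int, hidden: Vector) -> List[float]:
        frozen = self.freeze_after is not None and token_index >= self.freeze_after
        if frozen and self.cached is not None:
            return self.cached  # non-granular path: skip the extra computation
        self.calls += 1
        self.cached = self.compute_scalings(hidden)
        return self.cached


def toy_scalings(hidden: Vector) -> List[float]:
    """Hypothetical stand-in for a scaling head: normalised absolute values."""
    total = sum(abs(h) for h in hidden) or 1.0
    return [abs(h) / total for h in hidden]


if __name__ == "__main__":
    adapters = [[1.0, 0.0], [0.0, 1.0]]  # outputs of two toy adapters
    schedule = ScalingSchedule(toy_scalings, freeze_after=2)
    for t in range(5):
        scales = schedule.scalings_for(t, [1.0, 3.0])
        print(t, mix_adapters(adapters, scales))
    print("scaling computations:", schedule.calls)
```

In this toy model, the scaling computation is skipped once the cache is frozen. The sketch assumes that skipped work is where the throughput gain of non-granular scalings comes from; a real engine's savings depend on its implementation.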

Links:

Installation: https://github.com/EricLBuehler/mistral.rs/blob/master/mistralrs-pyo3/README.md
Runnable example: https://github.com/EricLBuehler/mistral.rs/blob/master/examples/python/xlora_zephyr.py
Adapter model docs and making the ordering file: https://github.com/EricLBuehler/mistral.rs/blob/master/docs/ADAPTER_MODELS.md