mirror of https://github.com/vllm-project/vllm.git
synced 2025-10-20 14:53:52 +08:00
[Docs] Improve API docs (+small tweaks) (#22459)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
@@ -58,10 +58,9 @@ nav:
     - CI: contributing/ci
   - Design Documents: design
   - API Reference:
-    - Summary: api/README.md
+    - Summary: api/summary.md
     - Contents:
-      - glob: api/vllm/*
-        preserve_directory_names: true
+      - api/vllm/*
   - CLI Reference:
     - Summary: cli/README.md
   - Community:
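In the nav hunk above, the structured `glob:` entry is replaced by a bare glob shorthand (`- api/vllm/*`). As an illustration only (this is not the nav plugin's implementation, and `expand_nav_glob` is a hypothetical helper), such a glob can be thought of as expanding into the sorted list of matching doc pages:

```python
# Hypothetical sketch of nav-glob expansion; not the actual plugin code.
from fnmatch import fnmatch


def expand_nav_glob(pattern: str, pages: list[str]) -> list[str]:
    """Return the doc pages matching a nav glob pattern, in sorted order."""
    return sorted(p for p in pages if fnmatch(p, pattern))


# Assumed page paths, for illustration only.
pages = [
    "api/README.md",
    "api/vllm/config.md",
    "api/vllm/engine.md",
    "cli/README.md",
]
print(expand_nav_glob("api/vllm/*", pages))
# ['api/vllm/config.md', 'api/vllm/engine.md']
```

Each matched page then becomes its own nav entry under "Contents", which is what the simplified one-line form expresses.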
@@ -1,7 +1,4 @@
----
-title: FP8 INC
----
 [](){ #inc }
+# FP8 INC
-
 vLLM supports FP8 (8-bit floating point) weight and activation quantization using Intel® Neural Compressor (INC) on Intel® Gaudi® 2 and Intel® Gaudi® 3 AI accelerators.
 Currently, quantization is validated only in Llama models.
@@ -105,7 +105,7 @@ class Example:
         return fix_case(self.path.stem.replace("_", " ").title())

     def generate(self) -> str:
-        content = f"---\ntitle: {self.title}\n---\n\n"
+        content = f"# {self.title}\n\n"
         content += f"Source <gh-file:{self.path.relative_to(ROOT_DIR)}>.\n\n"

         # Use long code fence to avoid issues with
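The `generate()` change above swaps YAML front matter for a Markdown H1, so the page title now comes from the heading itself. A minimal stand-in sketch (the real `Example` class, `fix_case`, and `ROOT_DIR` live in `docs/mkdocs/hooks/generate_examples.py`; this stub only mirrors the title and heading logic, omitting `fix_case` and the source link):

```python
# Stand-in for the real Example class in docs/mkdocs/hooks/generate_examples.py;
# it only reproduces the title/heading behavior changed in this commit.
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ExampleStub:
    path: Path

    @property
    def title(self) -> str:
        # Derive a title from the file stem, e.g. "offline_inference" -> "Offline Inference".
        return self.path.stem.replace("_", " ").title()

    def generate_old(self) -> str:
        # Before: page metadata carried the title as YAML front matter.
        return f"---\ntitle: {self.title}\n---\n\n"

    def generate_new(self) -> str:
        # After: the title is an H1 heading at the top of the page body.
        return f"# {self.title}\n\n"


stub = ExampleStub(Path("examples/offline_inference.md"))
print(stub.generate_new())  # "# Offline Inference\n\n"
```

With the heading in the body, the rendered page and its nav title stay in sync without any front-matter block.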
@@ -40,6 +40,7 @@ theme:
     - navigation.sections
     - navigation.prune
     - navigation.top
+    - navigation.indexes
     - search.highlight
     - search.share
     - toc.follow
@@ -51,11 +52,6 @@ hooks:
   - docs/mkdocs/hooks/generate_argparse.py
   - docs/mkdocs/hooks/url_schemes.py

-# Required to stop api-autonav from raising an error
-# https://github.com/tlambert03/mkdocs-api-autonav/issues/16
-nav:
-  - api
-
 plugins:
   - meta
   - search