From 277c059985e97dfe48780466f9cc9c6c4bf7a8aa Mon Sep 17 00:00:00 2001
From: sanjibani <18418553+sanjibani@users.noreply.github.com>
Date: Mon, 22 Jun 2026 12:35:06 +0530
Subject: [PATCH] docs: fix broken vLLM engine arguments docs link (vllm.mdx:6)

The link to vLLM's 'engine arguments documentation' points at
https://docs.vllm.ai/en/latest/usage/engine_args.html which returns 404.
vLLM reorganized the docs and moved the engine arguments page under
/configuration/.

Old: https://docs.vllm.ai/en/latest/usage/engine_args.html (404)
New: https://docs.vllm.ai/en/latest/configuration/engine_args.html (200)

Verified both URLs with curl.
---
 docs/customize/model-providers/more/vllm.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/customize/model-providers/more/vllm.mdx b/docs/customize/model-providers/more/vllm.mdx
index 3d15982dbb7..9c9fce67ce4 100644
--- a/docs/customize/model-providers/more/vllm.mdx
+++ b/docs/customize/model-providers/more/vllm.mdx
@@ -3,7 +3,7 @@ title: "vLLM"
 description: "Configure vLLM's high-performance inference library with Continue for chat, autocomplete, and embeddings, including setup instructions for Llama3.1, Qwen2.5-Coder, and Nomic Embed models"
 ---
 
-vLLM is an open-source library for fast LLM inference which typically is used to serve multiple users at the same time. It can also be used to run a large model on multiple GPU:s (e.g. when it doesn´t fit in a single GPU). Run their OpenAI-compatible server using `vllm serve`. See their [server documentation](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html) and the [engine arguments documentation](https://docs.vllm.ai/en/latest/usage/engine_args.html).
+vLLM is an open-source library for fast LLM inference which typically is used to serve multiple users at the same time. It can also be used to run a large model on multiple GPU:s (e.g. when it doesn´t fit in a single GPU). Run their OpenAI-compatible server using `vllm serve`. See their [server documentation](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html) and the [engine arguments documentation](https://docs.vllm.ai/en/latest/configuration/engine_args.html).
 
 ```shell
 vllm serve meta-llama/Meta-Llama-3.1-8B-Instruct