MyPrivateClaw

vLLM v0.18 Ships New 'vllm serve' CLI — Old Entrypoint Deprecated | Tool Update


Published on MyPrivateClaw

Mar 31, 2026, 6:50 AM UTC

Coverage date

Mar 20, 2026

Last updated

Apr 3, 2026, 2:34 PM UTC

News summary

vLLM v0.18 introduces a new top-level CLI command, 'vllm serve', which replaces the old 'python -m vllm.entrypoints.openai.api_server' invocation; the old entrypoint is now deprecated. The release also adds gRPC serving support through a new '--grpc' flag, for high-performance RPC-based serving alongside the existing HTTP API. The recommended install command is now 'uv pip install vllm --torch-backend=auto', which selects the PyTorch backend automatically. Update your scripts and Docker images before upgrading.
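As a rough aid for updating scripts, the sketch below rewrites a single shell command from the deprecated entrypoint to 'vllm serve'. It is not part of vLLM: the helper name migrate_command is hypothetical, the model name is only a placeholder, and moving '--model NAME' to a positional argument is an assumption about how 'vllm serve' expects the model to be passed. Check the rewritten command against the vLLM documentation before relying on it.

```python
import os
import shlex

# Module path of the deprecated entrypoint named in the article.
OLD_MODULE = "vllm.entrypoints.openai.api_server"


def migrate_command(command: str) -> str:
    """Rewrite 'python -m <old module> ...' to 'vllm serve ...' (hypothetical helper)."""
    tokens = shlex.split(command)
    if OLD_MODULE not in tokens:
        return command  # nothing to migrate

    i = tokens.index(OLD_MODULE)
    # Only rewrite the exact shape "<python> -m <old module>".
    if i < 2 or tokens[i - 1] != "-m":
        return command
    if not os.path.basename(tokens[i - 2]).startswith("python"):
        return command

    prefix = tokens[: i - 2]  # e.g. environment variable assignments
    args = iter(tokens[i + 1:])

    model = None
    rest = []
    for tok in args:
        # Assumption: 'vllm serve' takes the model as a positional argument.
        if tok == "--model":
            model = next(args, None)
        elif tok.startswith("--model="):
            model = tok.split("=", 1)[1]
        else:
            rest.append(tok)

    new_tokens = prefix + ["vllm", "serve"]
    if model:
        new_tokens.append(model)
    return shlex.join(new_tokens + rest)


if __name__ == "__main__":
    old = f"python -m {OLD_MODULE} --model facebook/opt-125m --port 8000"
    print(migrate_command(old))
    # prints: vllm serve facebook/opt-125m --port 8000
```

The helper handles one shell-style command string at a time. A Dockerfile CMD or ENTRYPOINT written in JSON array form would need its elements edited by hand or joined into a command string first.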