MyPrivateClaw
vLLM v0.18 Ships New 'vllm serve' CLI — Old Entrypoint Deprecated | Tool Update
vLLM v0.18 introduces a new top level CLI command — 'vllm serve' — replacing the old 'python m vllm.entrypoints.openai.api server' invocation, which is now dep…
Published on MyPrivateClaw
Mar 31, 2026, 6:50 AM UTC
Coverage date
Mar 20, 2026
Last updated
Apr 3, 2026, 2:34 PM UTC
News summary
vLLM v0.18 introduces a new top level CLI command — 'vllm serve' — replacing the old 'python m vllm.entrypoints.openai.api server' invocation, which is now deprecated. The release also adds gRPC serving support via a new grpc flag for high performance RPC based serving alongside the existing HTTP API. The recommended install command now uses 'uv pip install vllm torch backend=auto' for automatic PyTorch backend selection. Update your scripts and Docker images before upgrading.