Examples

This section contains examples of using LWS with or without a specific inference runtime.

vLLM

An example of using vLLM with LWS

TensorRT-LLM

An example of using TensorRT-LLM with LWS

llama.cpp

An example of using llama.cpp with LWS

SGLang

An example of using SGLang with LWS

Topology Aware Scheduling with Kueue

An example of using topology-aware scheduling with LWS and Kueue, using vLLM
