Lab 3: Implementing Prefill/Decode Scheduling Strategies with vLLM

Introduction

In this lab, you will implement prefill/decode scheduling strategies in vLLM.

Objectives

  • Understand the vLLM architecture.
  • Implement a custom scheduling policy.
  • Improve throughput and reduce latency.
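
To make the throughput/latency trade-off in these objectives concrete, here is a minimal toy sketch of a per-step scheduler that chooses between prefill work (processing a new prompt) and decode work (generating one token for a running request). It does not use vLLM's API: `Request`, `schedule_step`, `simulate`, and the `token_budget` parameter are hypothetical names chosen for illustration, and the model is simplified so that prefill only consumes the prompt while every output token comes from a later decode step.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    rid: str
    prompt_len: int       # tokens to process during prefill
    max_new_tokens: int   # tokens to generate during decode
    generated: int = 0


def schedule_step(waiting, running, token_budget, decode_first):
    """Pick the work for one step under a per-step token budget.

    Returns (prefill_requests, decode_requests). Toy model, not vLLM's scheduler.
    """
    prefill, decode = [], []
    budget = token_budget

    def take_decodes():
        nonlocal budget
        for req in running:          # each decode costs one token of budget
            if budget < 1:
                break
            decode.append(req)
            budget -= 1

    def take_prefills():
        nonlocal budget
        # Admit waiting requests in FIFO order while the whole prompt fits.
        while waiting and waiting[0].prompt_len <= budget:
            req = waiting.popleft()
            budget -= req.prompt_len
            prefill.append(req)

    if decode_first:
        take_decodes()
        take_prefills()
    else:
        take_prefills()
        take_decodes()
    return prefill, decode


def simulate(requests, token_budget, decode_first):
    """Run steps until every request finishes; return {rid: finishing step}."""
    waiting, running, finish_step, step = deque(requests), [], {}, 0
    while waiting or running:
        step += 1
        prefill, decode = schedule_step(waiting, running, token_budget, decode_first)
        if not prefill and not decode:
            raise ValueError("a prompt is larger than the token budget")
        for req in decode:
            req.generated += 1
            if req.generated >= req.max_new_tokens:
                running.remove(req)
                finish_step[req.rid] = step
        running.extend(prefill)      # newly prefilled requests decode from the next step
    return finish_step


def make_requests():
    # Fresh objects for each run, since simulate() mutates them.
    return [Request("A", prompt_len=4, max_new_tokens=2),
            Request("B", prompt_len=4, max_new_tokens=2)]


if __name__ == "__main__":
    print("prefill-first:", simulate(make_requests(), token_budget=4, decode_first=False))
    print("decode-first: ", simulate(make_requests(), token_budget=4, decode_first=True))
```

With these two requests and a budget of 4 tokens per step, the prefill-first policy finishes both requests at step 4, while the decode-first policy finishes A at step 3 but delays B until step 6. In this toy setting, prioritizing prefill helps overall completion time and prioritizing decode helps the latency of requests that are already running.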

Tasks

(Coming soon)

Released under CC BY-NC-SA 4.0 License.