[Docs] Add docs for disaggregated deployment (#6700)

* add docs for disaggregated deployment

* pre-commit run for style check

* update docs
Jingfeng Wu
2026-04-01 19:27:09 +08:00
committed by GitHub
parent ceaf5df350
commit 3b564116d5
6 changed files with 513 additions and 0 deletions
@@ -1,5 +1,7 @@
[简体中文](../zh/features/disaggregated.md)
[Best Practice](../best_practices/Disaggregated.md)
# Disaggregated Deployment
Large Language Model (LLM) inference is divided into two phases: **Prefill** and **Decode**, which are compute-intensive and memory-bound, respectively.