[Docs] Add docs for disaggregated deployment (#6700)

* add docs for disaggregated deployment
* pre-commit run for style check
* update docs
2026-04-23 00:17:25 +08:00 · 2026-04-01 19:27:09 +08:00
parent ceaf5df350
commit 3b564116d5
6 changed files with 513 additions and 0 deletions
@@ -1,5 +1,7 @@
 [简体中文](../zh/features/disaggregated.md)

+[Best Practice](../best_practices/Disaggregated.md)
+
 # Disaggregated Deployment

 Large Language Model (LLM) inference is divided into two phases: **Prefill** and **Decode**, which are compute-intensive and memory-bound, respectively.