mirror of
https://github.com/PaddlePaddle/FastDeploy.git
synced 2026-04-23 00:17:25 +08:00
[Docx] add language (en/cn) switch links (#4470)
* add install docs * 修改文档 * 修改文档
This commit is contained in:
@@ -1,3 +1,5 @@
|
||||
[English](../../features/disaggregated.md)
|
||||
|
||||
# 分离式部署
|
||||
|
||||
大模型推理分为两个部分Prefill和Decode阶段,分别为计算密集型(Prefill)和存储密集型(Decode)两部分。将Prefill 和 Decode 分开部署在一定场景下可以提高硬件利用率,有效提高吞吐,降低整句时延,
|
||||
|
||||
Reference in New Issue
Block a user