Files
FastDeploy/docs/zh/best_practices/DeepSeek-V3.md
T
AIbin 47bfd45bb6 [Docs]add deepseek model doc (#6513)
* add deepseek model doc
2026-02-26 14:08:19 +08:00

40 lines
985 B
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
[English](../../best_practices/DeepSeek-V3-V3.1.md)
# DeepSeek-V3/V3.1 模型
## 一、环境准备
### 1.1 支持情况
DeepSeek-V3/V3.1 各量化精度,在下列硬件上部署所需要的最小卡数如下:
| | WINT4 |
|-----|-----|
|H800 80GB| 8 |
### 1.2 安装fastdeploy
安装流程参考文档 [FastDeploy GPU 安装](../get_started/installation/nvidia_gpu.md)
## 二、如何使用
### 2.1 基础:启动服务
**示例1** H800上八卡部署wint4模型16K上下文的服务
```shell
MODEL_PATH=/models/DeepSeek-V3.2-Exp-BF16
export FD_DISABLE_CHUNKED_PREFILL=1
export FD_ATTENTION_BACKEND="MLA_ATTN"
export FLAGS_flash_attn_version=3
python -m fastdeploy.entrypoints.openai.api_server \
--model "$MODEL_PATH" \
--port 8180 \
--metrics-port 8181 \
--engine-worker-queue-port 8182 \
--cache-queue-port 8183 \
--tensor-parallel-size 8 \
--max-model-len 16384 \
--max-num-seq 100 \
--no-enable-prefix-caching \
--quantization wint4
```