# MooncakeStore for FastDeploy
This document describes how to use MooncakeStore as the KV cache storage backend for FastDeploy.
## Preparation
### Install FastDeploy
Refer to [NVIDIA CUDA GPU Installation](https://paddlepaddle.github.io/FastDeploy/get_started/installation/nvidia_gpu/) for FastDeploy installation.
### Install MooncakeStore
```bash
pip install mooncake-transfer-engine
```
## Run Examples
The example script is provided in `run.sh`. You can run it directly:
```bash
bash run.sh
```
The script starts a Mooncake master server and two FastDeploy servers.
Launch the Mooncake master server:
```bash
mooncake_master \
    --port=15001 \
    --enable_http_metadata_server=true \
    --http_metadata_server_host=0.0.0.0 \
    --http_metadata_server_port=15002 \
    --metrics_port=15003
```
More parameters can be found in the [official guide](https://github.com/kvcache-ai/Mooncake/blob/main/docs/source/python-api-reference/transfer-engine.md).
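
The FastDeploy launch step reads the store configuration from the file pointed to by `MOONCAKE_CONFIG_PATH`. Below is an illustrative sketch of what `./mooncake_config.json` could look like, with addresses matching the master server ports above; the exact field names and schema should be verified against the official Mooncake guide:

```json
{
    "local_hostname": "127.0.0.1",
    "metadata_server": "http://127.0.0.1:15002/metadata",
    "protocol": "tcp",
    "device_name": "",
    "master_server_address": "127.0.0.1:15001"
}
```

For RDMA-capable deployments, `protocol` and `device_name` would point at the RDMA transport and NIC instead; the TCP values here are only for a single-machine example.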
Launch FastDeploy with Mooncake enabled:
```bash
export MOONCAKE_CONFIG_PATH="./mooncake_config.json"
python -m fastdeploy.entrypoints.openai.api_server \
    --model ${MODEL_NAME} \
    --port ${PORT} \
    --metrics-port $((PORT + 1)) \
    --engine-worker-queue-port $((PORT + 2)) \
    --cache-queue-port $((PORT + 3)) \
    --max-model-len 32768 \
    --max-num-seqs 32 \
    --kvcache-storage-backend mooncake
```
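
Note that the launch command allocates four consecutive ports starting from `PORT`, so pick a value whose next three ports are also free. A small sketch of the derived ports (the value `8180` is only an example):

```shell
PORT=8180  # example base port; any value with three free ports above it works
echo "api-server:          ${PORT}"
echo "metrics:             $((PORT + 1))"
echo "engine-worker-queue: $((PORT + 2))"
echo "cache-queue:         $((PORT + 3))"
```

When running two FastDeploy servers as in `run.sh`, the two base ports must therefore be at least 4 apart.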
## Troubleshooting
For more details, please refer to the [Mooncake troubleshooting guide](https://github.com/kvcache-ai/Mooncake/blob/main/docs/source/troubleshooting/troubleshooting.md).