[Feature] Fix counter release logic & update go-router download URL (#6280)

* [Doc] Update prerequisites in the documentation
* [Feature] Enhance Router with /v1/completions, docs, scripts, and version info
* [Feature] Enhance Router with /v1/completions, docs, scripts, and version info
* [Feature] Enhance Router with /v1/completions, docs, scripts, and version info
* [Feature] Fix counter release logic
* [Feature] Update go-router download URL
* [Feature] Update go-router download URL
* [Feature] Update go-router download URL
* [Feature] Update go-router download URL
* [Feature] Update token counter logic and docs
* [Feature] Update token counter logic and docs

---------

Co-authored-by: mouxin <mouxin@baidu.com>
2026-04-23 00:17:25 +08:00 · 2026-02-04 15:02:38 +08:00
parent 36547cfdb3
commit 6e96bd0bd2
16 changed files with 115 additions and 51 deletions
@@ -4,12 +4,20 @@

 FastDeploy provides a Golang-based [Router](https://github.com/PaddlePaddle/FastDeploy/tree/develop/fastdeploy/golang_router) for request scheduling. The Router supports both centralized deployment and Prefill/Decode (PD) disaggregated deployment.

+![go-router](images/go-router-workflow.png)
+
 ## Installation

 ### 1. Prebuilt Binaries

 Starting from FastDeploy v2.5.0, the official Docker images include the Go environment required to build the Golang Router, as well as a precompiled Router binary. By default, the Router binary is located in the `/usr/local/bin` directory and can be used directly without additional compilation. For installation details, refer to the [FastDeploy Installation Guide](../get_started/installation/nvidia_gpu.md).

+If you need the Golang Router binary separately, download and install it with the following commands:
+```
+wget https://paddle-qa.bj.bcebos.com/paddle-pipeline/FastDeploy_ActionCE/develop/latest/fd-router
+mv fd-router /usr/local/bin/fd-router
+```
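A file fetched with `wget` is usually not marked executable, so the install step may also need a permission change. The following is a minimal sketch of that step, not part of the commit: the `install_router` function name and its two arguments are hypothetical and chosen for illustration only.

```shell
# Hypothetical helper (not part of FastDeploy): move a downloaded router
# binary into place and mark it executable.
# Arguments: $1 = path to the downloaded file, $2 = destination path.
install_router() {
  src="$1"
  dest="$2"
  # Refuse to continue if the download is missing.
  if [ ! -f "$src" ]; then
    echo "missing file: $src" >&2
    return 1
  fi
  # Move the binary, then grant execute permission.
  mv "$src" "$dest" && chmod +x "$dest"
}
```

Under these assumptions, `install_router fd-router /usr/local/bin/fd-router` would mirror the `mv` command above; writing to `/usr/local/bin` may require elevated privileges.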
+
 ### 2. Build from Source

 You need to build the Router from source in the following scenarios:
@@ -33,7 +41,7 @@ bash build.sh

 Start the Router service. The `--port` parameter specifies the scheduling port for centralized deployment.
 ```
-./fd-router --port 30000
+/usr/local/bin/fd-router --port 30000
 ```

 Start a mixed inference instance. Compared to standalone deployment, specify the Router endpoint via `--router`. Other parameters remain unchanged.
@@ -50,7 +58,7 @@ python -m fastdeploy.entrypoints.openai.api_server \

 Start the Router service with PD disaggregation enabled using the `--splitwise` flag.
 ```
-./fd-router \
+/usr/local/bin/fd-router \
  --port 30000 \
  --splitwise
 ```
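The launch commands in this document differ only in which flags are present. As an illustrative sketch, the hypothetical `router_cmd` function below assembles the command line from a port, a mode, and an optional config path. It uses only the flags that appear in this document (`--port`, `--splitwise`, `--config_path`); the function name and the `pd` mode label are assumptions made for the example.

```shell
# Hypothetical helper: print the fd-router launch command for a given mode.
# Arguments: $1 = port, $2 = mode ("pd" enables --splitwise), $3 = optional config path.
router_cmd() {
  port="$1"
  mode="${2:-}"
  config="${3:-}"
  cmd="/usr/local/bin/fd-router --port $port"
  # PD disaggregation is enabled with the --splitwise flag.
  if [ "$mode" = "pd" ]; then
    cmd="$cmd --splitwise"
  fi
  # A custom configuration file is passed via --config_path.
  if [ -n "$config" ]; then
    cmd="$cmd --config_path $config"
  fi
  echo "$cmd"
}
```

For example, `router_cmd 30000 pd` prints the PD-disaggregated launch command shown above, and omitting the mode prints the centralized one.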
@@ -105,7 +113,7 @@ popd

 Launch the Router with the custom configuration specified via `--config_path`:
 ```
-./fd-router \
+/usr/local/bin/fd-router \
  --port 30000 \
  --splitwise \
  --config_path examples/run_with_config/config/config.yaml