1 parent 8dbb656 commit 2b8fe03
protobuf/model_config.proto
@@ -1662,7 +1662,7 @@ message ModelEnsembling
//@@ .. cpp:var:: uint32 max_inflight_requests
//@@
- //@@ The maximum number of concurrent inflight requests at each ensemble
+ //@@ The maximum number of concurrent inflight requests at each ensemble
//@@ step.
//@@ This limit prevents unbounded memory growth when decoupled models
//@@ produce responses faster than downstream models can consume them.
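
The comment above describes a backpressure limit: a fast producer is held back once a fixed number of responses are waiting for a slower consumer. The Python sketch below illustrates that general idea with a bounded queue. It is not the ensemble scheduler's implementation; `run_pipeline` and its parameters are hypothetical names chosen for this illustration only.

```python
import queue
import threading


def run_pipeline(num_responses, max_inflight):
    """Push num_responses through a bounded buffer.

    Returns (consumed_items, peak_buffer_depth). Hypothetical helper,
    not part of any real scheduler API.
    """
    # put() blocks once max_inflight items are waiting, so the
    # producer cannot run ahead of the consumer without bound.
    buffer = queue.Queue(maxsize=max_inflight)
    consumed = []
    peak = 0

    def producer():
        nonlocal peak
        for i in range(num_responses):
            buffer.put(i)  # blocks while the buffer is full
            peak = max(peak, buffer.qsize())
        buffer.put(None)  # sentinel: no more responses

    def consumer():
        while True:
            item = buffer.get()
            if item is None:
                break
            consumed.append(item)

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return consumed, peak


if __name__ == "__main__":
    items, peak_depth = run_pipeline(num_responses=50, max_inflight=4)
    print(len(items), "responses consumed; peak buffer depth", peak_depth)
```

Without the `maxsize` bound, the queue would grow for as long as the producer outpaces the consumer, which is the unbounded memory growth the comment refers to.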