
Commit 4181b44

[serve] Allow RAY_SERVE_THROUGHPUT_OPTIMIZED flags to be overridden (#58057)
## Description

This PR allows users to override individual flags set by `RAY_SERVE_THROUGHPUT_OPTIMIZED`. Previously, if any flag needed a value different from the one `RAY_SERVE_THROUGHPUT_OPTIMIZED` sets, users had to give up the combined flag and set every flag individually.

Signed-off-by: akyang-anyscale <[email protected]>
1 parent ac943b3 commit 4181b44

2 files changed: +11 -4 lines changed

doc/source/serve/advanced-guides/performance.md

Lines changed: 1 addition & 0 deletions
@@ -152,6 +152,7 @@ You can also configure each option individually. The following table details the
 | `RAY_SERVE_REQUEST_PATH_LOG_BUFFER_SIZE=1000` | Sets the log buffer to batch writes to every `1000` logs, flushing the buffer on write. The system always flushes the buffer and writes logs when it detects a line with level ERROR. Set the buffer size to `1` to disable buffering and write logs immediately. |
 | `RAY_SERVE_LOG_TO_STDERR=0` | Only write logs to files under the `logs/serve/` directory. Proxy, Controller, and Replica logs no longer appear in the console, worker files, or the Actor Logs section of the Ray Dashboard. Set this property to `1` to enable additional logging. |

+You may want to enable throughput-optimized serving while customizing the options above. You can do this by setting `RAY_SERVE_THROUGHPUT_OPTIMIZED=1` and overriding the specific options. For example, to enable throughput-optimized serving and continue logging to stderr, you should set `RAY_SERVE_THROUGHPUT_OPTIMIZED=1` and override with `RAY_SERVE_LOG_TO_STDERR=1`.

 ## Debugging performance issues in controller
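As a rough illustration of the documented usage, the sketch below builds an environment mapping that turns on throughput-optimized mode and overrides a single option. The helper name `throughput_optimized_env` is hypothetical and not part of Ray. The constants appear to be read at module level, so the variables presumably need to be in the environment before the Serve processes start.

```python
import os


def throughput_optimized_env(base=None, **overrides):
    """Hypothetical helper: copy `base` (or os.environ), enable
    throughput-optimized mode, then apply per-option overrides."""
    env = dict(os.environ if base is None else base)
    env["RAY_SERVE_THROUGHPUT_OPTIMIZED"] = "1"
    # Environment values must be strings, so stringify each override.
    env.update({name: str(value) for name, value in overrides.items()})
    return env


# Enable throughput-optimized serving but keep logging to stderr.
env = throughput_optimized_env(base={}, RAY_SERVE_LOG_TO_STDERR=1)
```

The resulting mapping can be passed as the `env` argument of `subprocess.run` (or exported in a shell) for whichever command launches Serve in your setup.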

python/ray/serve/_private/constants.py

Lines changed: 10 additions & 4 deletions
@@ -501,10 +501,16 @@
 # This should be at the end.
 RAY_SERVE_THROUGHPUT_OPTIMIZED = get_env_bool("RAY_SERVE_THROUGHPUT_OPTIMIZED", "0")
 if RAY_SERVE_THROUGHPUT_OPTIMIZED:
-    RAY_SERVE_RUN_USER_CODE_IN_SEPARATE_THREAD = False
-    RAY_SERVE_REQUEST_PATH_LOG_BUFFER_SIZE = 1000
-    RAY_SERVE_RUN_ROUTER_IN_SEPARATE_LOOP = False
-    RAY_SERVE_LOG_TO_STDERR = False
+    RAY_SERVE_RUN_USER_CODE_IN_SEPARATE_THREAD = get_env_bool(
+        "RAY_SERVE_RUN_USER_CODE_IN_SEPARATE_THREAD", "0"
+    )
+    RAY_SERVE_REQUEST_PATH_LOG_BUFFER_SIZE = get_env_int(
+        "RAY_SERVE_REQUEST_PATH_LOG_BUFFER_SIZE", 1000
+    )
+    RAY_SERVE_RUN_ROUTER_IN_SEPARATE_LOOP = get_env_bool(
+        "RAY_SERVE_RUN_ROUTER_IN_SEPARATE_LOOP", "0"
+    )
+    RAY_SERVE_LOG_TO_STDERR = get_env_bool("RAY_SERVE_LOG_TO_STDERR", "0")

 # The maximum allowed RPC latency in milliseconds.
 # This is used to detect and warn about long RPC latencies
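The effect of this hunk can be sketched in isolation. The stand-ins below are assumptions, not Ray's actual helpers: they take the environment as an explicit mapping, and `get_env_bool` is assumed to treat only `"1"` as true. The hypothetical `resolve_flags` function mirrors the `if` block: when throughput-optimized mode is on, each flag defaults to the optimized value unless its own variable is set.

```python
def get_env_bool(env, name, default):
    # Stand-in (assumption): only the string "1" counts as True.
    return env.get(name, default) == "1"


def get_env_int(env, name, default):
    # Stand-in (assumption): parse the variable as an integer if present.
    return int(env.get(name, default))


def resolve_flags(env):
    """Hypothetical model of the block above. Returns the flag values that
    throughput-optimized mode would assign, or {} when the mode is off
    (the real module keeps its earlier definitions in that case)."""
    flags = {}
    if get_env_bool(env, "RAY_SERVE_THROUGHPUT_OPTIMIZED", "0"):
        flags["RAY_SERVE_RUN_USER_CODE_IN_SEPARATE_THREAD"] = get_env_bool(
            env, "RAY_SERVE_RUN_USER_CODE_IN_SEPARATE_THREAD", "0"
        )
        flags["RAY_SERVE_REQUEST_PATH_LOG_BUFFER_SIZE"] = get_env_int(
            env, "RAY_SERVE_REQUEST_PATH_LOG_BUFFER_SIZE", 1000
        )
        flags["RAY_SERVE_RUN_ROUTER_IN_SEPARATE_LOOP"] = get_env_bool(
            env, "RAY_SERVE_RUN_ROUTER_IN_SEPARATE_LOOP", "0"
        )
        flags["RAY_SERVE_LOG_TO_STDERR"] = get_env_bool(
            env, "RAY_SERVE_LOG_TO_STDERR", "0"
        )
    return flags


# Optimized mode on, with stderr logging explicitly kept.
flags = resolve_flags(
    {"RAY_SERVE_THROUGHPUT_OPTIMIZED": "1", "RAY_SERVE_LOG_TO_STDERR": "1"}
)
```

Before this change the four assignments were hard-coded, so the `RAY_SERVE_LOG_TO_STDERR=1` entry in the example would have been ignored whenever optimized mode was enabled.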
