1 change: 1 addition & 0 deletions docs/parameters.md
@@ -56,6 +56,7 @@ Configuration options for the server are defined only via command-line options a
| `allowed_headers` | `string` (default: *) | Comma-separated list of allowed headers in CORS requests. |
| `allowed_methods` | `string` (default: *) | Comma-separated list of allowed methods in CORS requests. |
| `allowed_origins` | `string` (default: *) | Comma-separated list of allowed origins in CORS requests. |
| `api_key_file` | `string` | Path to the text file with the API key for generative endpoints `/v3/`. The value of the first line is used. If not specified, the server uses the environment variable `API_KEY`. If neither is set, requests will not require authorization. |

## Config management mode options

15 changes: 8 additions & 7 deletions docs/security_considerations.md
@@ -5,13 +5,9 @@
By default, the OpenVINO Model Server containers start with the security context of a local account `ovms` with Linux UID 5000. This ensures the Docker container does not have elevated permissions on the host machine. This is in line with best practices to use minimal permissions when running containerized applications. You can change the security context by adding the `--user` parameter to the Docker run command. This may be needed for loading mounted models with restricted access.
For additional security hardening, you might also consider preventing write operations on the container root filesystem by adding the `--read-only` flag. This prevents undesired modification of the container files. In case the cloud storage used for the model repository (S3, Google Storage, or Azure storage) is restricting the root filesystem, it should be combined with the `--tmpfs /tmp` flag.

```bash
mkdir -p models/resnet/1
wget -P models/resnet/1 https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/resnet50-binary-0001/FP32-INT1/resnet50-binary-0001.bin
wget -P models/resnet/1 https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/resnet50-binary-0001/FP32-INT1/resnet50-binary-0001.xml

docker run --rm -d --user $(id -u):$(id -g) --read-only --tmpfs /tmp -v ${PWD}/models/:/models -p 9178:9178 openvino/model_server:latest \
--model_path /models/resnet/ --model_name resnet --port 9178
```
```bash
docker run --rm -d --user $(id -u):$(id -g) --read-only --tmpfs /tmp -p 9000:9000 openvino/model_server:latest \
--model_path s3://bucket/model --model_name model --port 9000
```
---
@@ -21,11 +17,16 @@ See also:
- [Securing OVMS with NGINX](../extras/nginx-mtls-auth/README.md)
- [Securing models with OVSA](https://docs.openvino.ai/2025/about-openvino/openvino-ecosystem/openvino-project/openvino-security-add-on.html)

---
Generative endpoints starting with `/v3` can be restricted with authorization based on an API key. The key can be set during server initialization with the `api_key_file` parameter or the `API_KEY` environment variable.
The `api_key_file` parameter should contain the path to a file holding the API key value. The first line of the file is used. If neither the `api_key_file` parameter nor the `API_KEY` variable is set, the server will not require any authorization. The client should send the API key in the `Authorization` header as `Bearer <api_key>`.
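
The resolution order described above (file first, then the `API_KEY` environment variable, otherwise no authorization) can be sketched in C++ as follows. This is a minimal illustration and not the server's implementation: `resolveApiKey` and `authorizationHeaderValue` are hypothetical helper names, and the sketch throws on an unreadable file, whereas the server code in this change prints an error and exits.

```cpp
#include <cstdlib>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of the server): returns the API key taken
// from the first line of `apiKeyFile` or, when no file is given, from the
// API_KEY environment variable. An empty result means no authorization.
std::string resolveApiKey(const std::string& apiKeyFile) {
    std::string key;
    if (!apiKeyFile.empty()) {
        std::ifstream file(apiKeyFile);
        if (!file.is_open()) {
            // Assumption of this sketch: report the problem with an exception.
            throw std::runtime_error("Unable to open API key file: " + apiKeyFile);
        }
        std::getline(file, key);
        // Drop trailing whitespace, for example '\r' from Windows line endings.
        const auto end = key.find_last_not_of(" \n\r\t");
        if (end != std::string::npos) {
            key.erase(end + 1);
        }
        return key;
    }
    if (const char* env = std::getenv("API_KEY")) {
        key = env;
    }
    return key;
}

// Hypothetical helper: the value a client places in the `Authorization` header.
std::string authorizationHeaderValue(const std::string& apiKey) {
    return "Bearer " + apiKey;
}
```

A client would send `Authorization:` followed by the result of `authorizationHeaderValue(key)` with each request to a `/v3` endpoint.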

---

OpenVINO Model Server has a set of mechanisms preventing denial of service attacks from the client applications. They include the following:
- setting the number of inference execution streams which can limit the number of parallel inference calls in progress for each model. It can be tuned with `NUM_STREAMS` or `PERFORMANCE_HINT` plugin config.
- setting the maximum number of gRPC threads which is, by default, configured to the number 8 * number_of_cores. It can be changed with the parameter `--grpc_max_threads`.
- setting the maximum number of REST workers which is, by default, configured to the number 4 * number_of_cores. It can be changed with the parameter `--rest_workers`.
- setting the maximum size of REST and gRPC messages, which is 1GB. Bigger messages will be rejected.
- setting max_concurrent_streams which defines how many concurrent threads can be initiated from a single client - the remaining will be queued. The default is equal to the number of CPU cores. It can be changed with `--grpc_channel_arguments grpc.max_concurrent_streams=8`.
- setting the gRPC memory quota for the requests buffer - the default is 2GB. It can be changed with `--grpc_memory_quota=2147483648`. Value `0` invalidates the quota.
1 change: 1 addition & 0 deletions src/capi_frontend/server_settings.hpp
@@ -178,6 +178,7 @@ struct ServerSettingsImpl {
std::string allowedOrigins{"*"};
std::string allowedMethods{"*"};
std::string allowedHeaders{"*"};
std::string apiKey;
#ifdef MTR_ENABLED
std::string tracePath;
#endif
33 changes: 32 additions & 1 deletion src/cli_parser.cpp
@@ -15,6 +15,7 @@
//*****************************************************************************
#include "cli_parser.hpp"

#include <cstdlib>
#include <filesystem>
#include <fstream>
#include <iostream>
#include <stdexcept>
#include <string>
@@ -35,6 +36,7 @@
namespace ovms {

constexpr const char* CONFIG_MANAGEMENT_HELP_GROUP{"config management"};
constexpr const char* API_KEY_ENV_VAR{"API_KEY"};

std::string getConfigPath(const std::string& configPath) {
bool isDir = false;
@@ -160,7 +162,11 @@ void CLIParser::parse(int argc, char** argv) {
("allowed_headers",
"Comma separated list of headers that are allowed to access the API. Default: *.",
cxxopts::value<std::string>()->default_value("*"),
"ALLOWED_HEADERS");
"ALLOWED_HEADERS")
("api_key_file",
"Path to the text file containing the API key used to authenticate requests to generative endpoints. If not set, the API_KEY environment variable is used. If neither is set, authentication is disabled.",
cxxopts::value<std::string>()->default_value(""),
"API_KEY");

options->add_options("multi model")
("config_path",
@@ -493,6 +499,31 @@ void CLIParser::prepareServer(ServerSettingsImpl& serverSettings) {
serverSettings.allowedOrigins = result->operator[]("allowed_origins").as<std::string>();
serverSettings.allowedMethods = result->operator[]("allowed_methods").as<std::string>();
serverSettings.allowedHeaders = result->operator[]("allowed_headers").as<std::string>();
std::filesystem::path api_key_file = result->operator[]("api_key_file").as<std::string>();
serverSettings.apiKey = "";
if (!api_key_file.empty()) {
std::ifstream file(api_key_file);
if (file.is_open()) {
std::getline(file, serverSettings.apiKey);
// Use the first line and trim trailing whitespace characters
size_t endpos = serverSettings.apiKey.find_last_not_of(" \n\r\t");
if (endpos != std::string::npos) {
serverSettings.apiKey = serverSettings.apiKey.substr(0, endpos + 1);
}
file.close();
} else {
std::cerr << "Error reading API key file: Unable to open file " << api_key_file << std::endl;
exit(OVMS_EX_USAGE);
}
} else {
const char* envApiKey = std::getenv(API_KEY_ENV_VAR);
if (envApiKey != nullptr) {
serverSettings.apiKey = envApiKey;
}
if (serverSettings.apiKey.empty()) {
Collaborator: Do we intend to print this message in both cases (empty file with API key and empty env)?

Collaborator Author: Only in the case where both the file name and the env variable are empty.


std::cerr << "Warning: API key not provided via --api_key_file or API_KEY environment variable. Authentication will be disabled." << std::endl;
}
}
}

void CLIParser::prepareModel(ModelsSettingsImpl& modelsSettings, HFSettingsImpl& hfSettings) {
1 change: 1 addition & 0 deletions src/config.cpp
@@ -361,5 +361,6 @@ const std::string& Config::allowedOrigins() const { return this->serverSettings.
const std::string& Config::allowedMethods() const { return this->serverSettings.allowedMethods; }
const std::string& Config::allowedHeaders() const { return this->serverSettings.allowedHeaders; }
const std::string Config::cacheDir() const { return this->serverSettings.cacheDir; }
const std::string& Config::apiKey() const { return this->serverSettings.apiKey; }

} // namespace ovms
1 change: 1 addition & 0 deletions src/config.hpp
@@ -323,6 +323,7 @@ class Config {
const std::string& allowedOrigins() const;
const std::string& allowedMethods() const;
const std::string& allowedHeaders() const;
const std::string& apiKey() const;

/**
* @brief Model cache directory
40 changes: 37 additions & 3 deletions src/http_rest_api_handler.cpp
@@ -15,6 +15,7 @@
//*****************************************************************************
#include "http_rest_api_handler.hpp"

#include <algorithm>
#include <cctype>
#include <iomanip>
#include <memory>
@@ -124,6 +125,7 @@ const std::string HttpRestApiHandler::v3_RegexExp =
const std::string HttpRestApiHandler::metricsRegexExp = R"((.?)\/metrics(\?(.*))?)";

HttpRestApiHandler::HttpRestApiHandler(ovms::Server& ovmsServer, int timeout_in_ms) :
apiKey(ovmsServer.getAPIKey()),
predictionRegex(predictionRegexExp),
modelstatusRegex(modelstatusRegexExp),
configReloadRegex(configReloadRegexExp),
@@ -221,7 +223,7 @@ void HttpRestApiHandler::registerAll() {
});
registerHandler(V3, [this](const std::string_view uri, const HttpRequestComponents& request_components, std::string& response, const std::string& request_body, HttpResponseComponents& response_components, std::shared_ptr<HttpAsyncWriter> serverReaderWriter, std::shared_ptr<MultiPartParser> multiPartParser) -> Status {
OVMS_PROFILE_FUNCTION();
return processV3(uri, request_components, response, request_body, std::move(serverReaderWriter), std::move(multiPartParser));
return processV3(uri, request_components, response, request_body, std::move(serverReaderWriter), std::move(multiPartParser), apiKey);
});
registerHandler(Metrics, [this](const std::string_view uri, const HttpRequestComponents& request_components, std::string& response, const std::string& request_body, HttpResponseComponents& response_components, std::shared_ptr<HttpAsyncWriter> serverReaderWriter, std::shared_ptr<MultiPartParser> multiPartParser) -> Status {
return processMetrics(request_components, response, request_body);
@@ -668,14 +670,46 @@ Status HttpRestApiHandler::processListModelsRequest(std::string& response) {
return StatusCode::OK;
}

Status HttpRestApiHandler::processV3(const std::string_view uri, const HttpRequestComponents& request_components, std::string& response, const std::string& request_body, std::shared_ptr<HttpAsyncWriter> serverReaderWriter, std::shared_ptr<MultiPartParser> multiPartParser) {
std::unordered_map<std::string, std::string> HttpRestApiHandler::toLowerCaseHeaders(const std::unordered_map<std::string, std::string>& headers) {
std::unordered_map<std::string, std::string> lowercaseHeaders;
for (const auto& [key, value] : headers) {
std::string lowercaseKey = key;
std::transform(lowercaseKey.begin(), lowercaseKey.end(), lowercaseKey.begin(),
[](unsigned char c) { return std::tolower(c); });
lowercaseHeaders[lowercaseKey] = value;
}
return lowercaseHeaders;
}

Status HttpRestApiHandler::checkIfAuthorized(const std::unordered_map<std::string, std::string>& headers, const std::string& apiKey) {
if (!apiKey.empty()) {
auto lowercaseHeaders = toLowerCaseHeaders(headers);
if (lowercaseHeaders.count("authorization")) {
if (lowercaseHeaders.at("authorization") != "Bearer " + apiKey) {
SPDLOG_DEBUG("Unauthorized request - invalid API key provided.");
return StatusCode::UNAUTHORIZED;
}
} else {
SPDLOG_DEBUG("Unauthorized request - missing API key");
return StatusCode::UNAUTHORIZED;
}
}
return StatusCode::OK;
}
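
The behavior of the check above can be exercised in isolation with a sketch like the one below. It assumes a hypothetical free function `isAuthorized` that returns a `bool` instead of `Status`, and it stops at the first header whose lowercased name is `authorization` rather than building a lowercased copy of the whole map. It illustrates that header names are matched case-insensitively, while the `Bearer ` prefix and the key itself must match exactly.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <unordered_map>

using Headers = std::unordered_map<std::string, std::string>;

// Hypothetical stand-alone version of the check: true when the request may proceed.
bool isAuthorized(const Headers& headers, const std::string& apiKey) {
    if (apiKey.empty()) {
        return true;  // no key configured, so authorization is disabled
    }
    const std::string expected = "Bearer " + apiKey;
    for (const auto& [name, value] : headers) {
        // HTTP header names are case-insensitive, so compare in lowercase.
        std::string lowered = name;
        std::transform(lowered.begin(), lowered.end(), lowered.begin(),
            [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
        if (lowered == "authorization") {
            return value == expected;  // scheme and key are compared exactly
        }
    }
    return false;  // Authorization header is missing
}
```

For example, `isAuthorized({{"AUTHORIZATION", "Bearer abc"}}, "abc")` is accepted, while a header value of `bearer abc` is rejected because the scheme is compared case-sensitively.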

Status HttpRestApiHandler::processV3(const std::string_view uri, const HttpRequestComponents& request_components, std::string& response, const std::string& request_body, std::shared_ptr<HttpAsyncWriter> serverReaderWriter, std::shared_ptr<MultiPartParser> multiPartParser, const std::string& apiKey) {
#if (MEDIAPIPE_DISABLE == 0)
OVMS_PROFILE_FUNCTION();

HttpPayload request;
std::string modelName;
bool streamFieldVal = false;

// reject the request early when an API key is configured and the Authorization header is missing or invalid
Status authStatus = checkIfAuthorized(request_components.headers, apiKey);
if (!authStatus.ok()) {
return authStatus;
}
auto status = createV3HttpPayload(uri, request_components, response, request_body, serverReaderWriter, std::move(multiPartParser), request, modelName, streamFieldVal);
if (!status.ok()) {
SPDLOG_DEBUG("Failed to create V3 payload: {}", status.string());
5 changes: 4 additions & 1 deletion src/http_rest_api_handler.hpp
@@ -238,9 +238,12 @@ class HttpRestApiHandler {
Status processServerLiveKFSRequest(const HttpRequestComponents& request_components, std::string& response, const std::string& request_body);
Status processServerMetadataKFSRequest(const HttpRequestComponents& request_components, std::string& response, const std::string& request_body);

Status processV3(const std::string_view uri, const HttpRequestComponents& request_components, std::string& response, const std::string& request_body, std::shared_ptr<HttpAsyncWriter> serverReaderWriter, std::shared_ptr<MultiPartParser> multiPartParser);
Status processV3(const std::string_view uri, const HttpRequestComponents& request_components, std::string& response, const std::string& request_body, std::shared_ptr<HttpAsyncWriter> serverReaderWriter, std::shared_ptr<MultiPartParser> multiPartParser, const std::string& apiKey);
Status processListModelsRequest(std::string& response);
Status processRetrieveModelRequest(const std::string& name, std::string& response);
std::unordered_map<std::string, std::string> toLowerCaseHeaders(const std::unordered_map<std::string, std::string>& headers);
Status checkIfAuthorized(const std::unordered_map<std::string, std::string>& headers, const std::string& apiKey);
const std::string apiKey;

private:
const std::regex predictionRegex;
6 changes: 6 additions & 0 deletions src/server.cpp
@@ -123,6 +123,7 @@ static void logConfig(const Config& config) {
SPDLOG_DEBUG("gRPC channel arguments: {}", config.grpcChannelArguments());
SPDLOG_DEBUG("log level: {}", config.logLevel());
SPDLOG_DEBUG("log path: {}", config.logPath());
SPDLOG_TRACE("API key: {}", config.apiKey().empty() ? "not set" : "set");  // avoid writing the secret to logs
SPDLOG_DEBUG("file system poll wait milliseconds: {}", config.filesystemPollWaitMilliseconds());
SPDLOG_DEBUG("sequence cleaner poll wait minutes: {}", config.sequenceCleanerPollWaitMinutes());
SPDLOG_DEBUG("model_repository_path: {}", config.getServerSettings().hfSettings.downloadPath);
@@ -302,6 +303,7 @@ Status Server::startModules(ovms::Config& config) {
// that's why we delay starting the servable until the very end while we need to create it before
// GRPC & REST
Status status;
apiKey = config.apiKey();
bool inserted = false;
auto it = modules.end();
if (config.getServerSettings().serverMode == UNKNOWN_MODE) {
@@ -371,6 +373,10 @@ void Server::ensureModuleShutdown(const std::string& name) {
it->second->shutdown();
}

std::string Server::getAPIKey() const {
return apiKey;
}

class ModulesShutdownGuard {
Server& server;

2 changes: 2 additions & 0 deletions src/server.hpp
@@ -51,8 +51,10 @@ class Server {
virtual ~Server();
Status startModules(ovms::Config& config);
void shutdownModules();
std::string getAPIKey() const;

private:
void ensureModuleShutdown(const std::string& name);
std::string apiKey;
};
} // namespace ovms
1 change: 1 addition & 0 deletions src/status.cpp
@@ -133,6 +133,7 @@ const std::unordered_map<StatusCode, std::string> Status::statusMessageMap = {
{StatusCode::UNKNOWN_REQUEST_COMPONENTS_TYPE, "Request components type not recognized"},
{StatusCode::FAILED_TO_PARSE_MULTIPART_CONTENT_TYPE, "Request of multipart type but failed to parse"},
{StatusCode::FAILED_TO_DEDUCE_MODEL_NAME_FROM_URI, "Failed to deduce model name from all possible ways"},
{StatusCode::UNAUTHORIZED, "Unauthorized request due to invalid api-key"},

// Rest parser failure
{StatusCode::REST_BODY_IS_NOT_AN_OBJECT, "Request body should be JSON object"},
1 change: 1 addition & 0 deletions src/status.hpp
@@ -176,6 +176,7 @@ enum class StatusCode {
UNKNOWN_REQUEST_COMPONENTS_TYPE, /*!< Components type not recognized */
FAILED_TO_PARSE_MULTIPART_CONTENT_TYPE, /*!< Request of multipart type but failed to parse */
FAILED_TO_DEDUCE_MODEL_NAME_FROM_URI, /*!< Failed to deduce model name from all possible ways */
UNAUTHORIZED, /*!< Unauthorized request due to missing or invalid api-key */

// REST Parse
REST_BODY_IS_NOT_AN_OBJECT, /*!< REST body should be JSON object */