This is an API server written in Go that receives and processes BSS (BPF Scheduler Subsystem) metrics data and provides system information.
- Receive BSS metrics data sent by clients
- Query pod-to-PID mappings from the system
- Provide a RESTful API in JSON format
- Include a health check endpoint
- Support CORS
- Log requests
- Handle errors and validate input
- Integrate with Kubernetes to obtain pod label information
- Support configurable scheduling strategies based on pod labels

- URL: `/api/v1/metrics`
- Method: `POST`
- Content-Type: `application/json`

Request body:

```json
{
  "usersched_pid": 1234,
  "nr_queued": 10,
  "nr_scheduled": 5,
  "nr_running": 2,
  "nr_online_cpus": 8,
  "nr_user_dispatches": 100,
  "nr_kernel_dispatches": 50,
  "nr_cancel_dispatches": 2,
  "nr_bounce_dispatches": 1,
  "nr_failed_dispatches": 0,
  "nr_sched_congested": 3
}
```

Success response:

```json
{
  "success": true,
  "message": "Metrics received successfully",
  "timestamp": "2025-06-19T10:30:00Z"
}
```

Error response:

```json
{
  "success": false,
  "error": "Invalid JSON format: ..."
}
```

- URL:
`/api/v1/pods/pids`
- Method: `GET`

Success response:

```json
{
  "success": true,
  "message": "Pod-PID mappings retrieved successfully",
  "timestamp": "2025-06-25T13:50:21Z",
  "pods": [
    {
      "pod_name": "",
      "namespace": "",
      "pod_uid": "65979e01-4cb1-4d08-9dba-45530253ff00",
      "container_id": "5148a146ffbbe8672f11494843d54b8769d2eccc677c02027fc09aba192e3c67",
      "processes": [
        {
          "pid": 717720,
          "command": "pause",
          "ppid": 717576
        },
        {
          "pid": 718001,
          "command": "loki",
          "ppid": 717576
        }
      ]
    }
  ]
}
```

Error response:

```json
{
  "success": false,
  "error": "Failed to get pod-pid mappings: ..."
}
```

- URL:
`/api/v1/scheduling/strategies`
- Methods: `GET`, `POST`
- Content-Type: `application/json`

`GET` response:

```json
{
  "success": true,
  "message": "Scheduling strategies retrieved successfully",
  "timestamp": "2025-06-19T10:30:00Z",
  "scheduling": [
    {
      "priority": true,
      "execution_time": 20000000,
      "pid": 718001
    },
    {
      "priority": false,
      "execution_time": 10000000,
      "pid": 717720
    }
  ]
}
```

`POST` request body:

```json
{
  "strategies": [
    {
      "priority": true,
      "execution_time": 20000000,
      "selectors": [
        {
          "key": "nf",
          "value": "upf"
        }
      ]
    },
    {
      "priority": false,
      "execution_time": 10000000,
      "pid": 717720
    }
  ]
}
```

`POST` response:

```json
{
  "success": true,
  "message": "Scheduling strategies saved successfully",
  "timestamp": "2025-06-19T10:30:00Z"
}
```

- URL:
`/health`
- Method: `GET`

Response:

```json
{
  "status": "healthy",
  "timestamp": "2025-06-19T10:30:00Z",
  "service": "BSS Metrics API Server"
}
```

- URL:
`/`
- Method: `GET`

Install dependencies and start the server:

```bash
go mod tidy
go run main.go
```

The service will start on http://localhost:8080.

```bash
# Use local Kubernetes config
go run main.go --kubeconfig=$HOME/.kube/config

# Run in-cluster (when deployed in Kubernetes)
go run main.go --in-cluster=true
```

Send metrics data:

```bash
curl -X POST http://localhost:8080/api/v1/metrics \
  -H "Content-Type: application/json" \
  -d '{
    "usersched_pid": 1234,
    "nr_queued": 10,
    "nr_scheduled": 5,
    "nr_running": 2,
    "nr_online_cpus": 8,
    "nr_user_dispatches": 100,
    "nr_kernel_dispatches": 50,
    "nr_cancel_dispatches": 2,
    "nr_bounce_dispatches": 1,
    "nr_failed_dispatches": 0,
    "nr_sched_congested": 3
  }'
```

Health check:

```bash
curl http://localhost:8080/health
```

Pod-PID mappings:

```bash
# Get all pod-pid mappings
curl -X GET http://localhost:8080/api/v1/pods/pids

# Format output with jq for better readability
curl -s -X GET http://localhost:8080/api/v1/pods/pids | jq '.'

# Get only specific information (example: extract pod UIDs and process counts)
curl -s -X GET http://localhost:8080/api/v1/pods/pids | jq '.pods[] | {pod_uid: .pod_uid, process_count: (.processes | length)}'
```

Get scheduling strategies:

```bash
curl -X GET http://localhost:8080/api/v1/scheduling/strategies
```

Set scheduling strategies:

```bash
curl -X POST http://localhost:8080/api/v1/scheduling/strategies \
  -H "Content-Type: application/json" \
  -d '{
    "strategies": [
      {
        "priority": true,
        "execution_time": 20000000,
        "selectors": [
          {
            "key": "nf",
            "value": "upf"
          }
        ]
      },
      {
        "priority": false,
        "execution_time": 10000000,
        "pid": 717720
      }
    ]
  }'
```

Metrics request fields:

| Field | Type | Description |
|---|---|---|
| `usersched_pid` | `uint32` | PID of the userspace scheduler |
| `nr_queued` | `uint64` | Number of tasks queued in the userspace scheduler |
| `nr_scheduled` | `uint64` | Number of tasks scheduled by the userspace scheduler |
| `nr_running` | `uint64` | Number of tasks currently running in the userspace scheduler |
| `nr_online_cpus` | `uint64` | Number of online CPUs in the system |
| `nr_user_dispatches` | `uint64` | Number of userspace dispatches |
| `nr_kernel_dispatches` | `uint64` | Number of kernel space dispatches |
| `nr_cancel_dispatches` | `uint64` | Number of cancelled dispatches |
| `nr_bounce_dispatches` | `uint64` | Number of bounce dispatches |
| `nr_failed_dispatches` | `uint64` | Number of failed dispatches |
| `nr_sched_congested` | `uint64` | Number of scheduler congestion occurrences |

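
The field table above maps directly onto a Go struct with JSON tags. The following is a minimal sketch, not the server's actual code: the names `MetricsPayload` and `decodeMetrics` are hypothetical, and rejecting unknown keys is an assumed validation policy. Only the JSON keys and integer widths come from the table.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// MetricsPayload mirrors the documented request fields. The type name is
// hypothetical; the JSON keys and integer widths follow the field table.
type MetricsPayload struct {
	UserschedPID       uint32 `json:"usersched_pid"`
	NrQueued           uint64 `json:"nr_queued"`
	NrScheduled        uint64 `json:"nr_scheduled"`
	NrRunning          uint64 `json:"nr_running"`
	NrOnlineCPUs       uint64 `json:"nr_online_cpus"`
	NrUserDispatches   uint64 `json:"nr_user_dispatches"`
	NrKernelDispatches uint64 `json:"nr_kernel_dispatches"`
	NrCancelDispatches uint64 `json:"nr_cancel_dispatches"`
	NrBounceDispatches uint64 `json:"nr_bounce_dispatches"`
	NrFailedDispatches uint64 `json:"nr_failed_dispatches"`
	NrSchedCongested   uint64 `json:"nr_sched_congested"`
}

// decodeMetrics parses a JSON body into MetricsPayload. Rejecting unknown
// keys is an assumption of this sketch, not documented server behavior.
func decodeMetrics(body string) (MetricsPayload, error) {
	var m MetricsPayload
	dec := json.NewDecoder(strings.NewReader(body))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&m); err != nil {
		return MetricsPayload{}, fmt.Errorf("invalid JSON format: %w", err)
	}
	return m, nil
}

func main() {
	m, err := decodeMetrics(`{"usersched_pid": 1234, "nr_queued": 10, "nr_online_cpus": 8}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(m.UserschedPID, m.NrQueued, m.NrOnlineCPUs)
}
```

Because every counter is unsigned, `encoding/json` already rejects negative values during decoding, so no separate range check is needed for that case.
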

Fields of each entry in `pods`:

| Field | Type | Description |
|---|---|---|
| `pod_name` | `string` | Name of the pod (currently empty, extracted from metadata) |
| `namespace` | `string` | Namespace of the pod (currently empty, extracted from metadata) |
| `pod_uid` | `string` | Unique identifier of the pod |
| `container_id` | `string` | Container ID within the pod |
| `processes` | `[]PodProcess` | List of processes running in the pod |


Fields of each entry in `processes` (`PodProcess`):

| Field | Type | Description |
|---|---|---|
| `pid` | `int` | Process ID |
| `command` | `string` | Command name of the process |
| `ppid` | `int` | Parent Process ID (optional) |

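
A client can decode the `/api/v1/pods/pids` response with types shaped like the two tables above. This sketch reproduces the jq example (process count per pod UID) in Go. `PodProcess` is named in the table; `PodInfo`, `podPIDResponse`, and `processCounts` are hypothetical names. Summing across entries that share a `pod_uid` assumes a pod may appear once per container, which the source does not state.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PodProcess matches the process table; ppid is optional.
type PodProcess struct {
	PID     int    `json:"pid"`
	Command string `json:"command"`
	PPID    int    `json:"ppid,omitempty"`
}

// PodInfo is a hypothetical name for one entry of the "pods" array.
type PodInfo struct {
	PodName     string       `json:"pod_name"`
	Namespace   string       `json:"namespace"`
	PodUID      string       `json:"pod_uid"`
	ContainerID string       `json:"container_id"`
	Processes   []PodProcess `json:"processes"`
}

// podPIDResponse is a hypothetical name for the response envelope.
type podPIDResponse struct {
	Success   bool      `json:"success"`
	Message   string    `json:"message"`
	Timestamp string    `json:"timestamp"`
	Pods      []PodInfo `json:"pods"`
}

// processCounts returns the number of processes per pod UID.
func processCounts(body []byte) (map[string]int, error) {
	var resp podPIDResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	counts := make(map[string]int, len(resp.Pods))
	for _, p := range resp.Pods {
		counts[p.PodUID] += len(p.Processes)
	}
	return counts, nil
}

func main() {
	body := []byte(`{"success": true, "pods": [{
		"pod_uid": "65979e01-4cb1-4d08-9dba-45530253ff00",
		"processes": [
			{"pid": 717720, "command": "pause", "ppid": 717576},
			{"pid": 718001, "command": "loki", "ppid": 717576}
		]}]}`)
	counts, err := processCounts(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for uid, n := range counts {
		fmt.Println(uid, n)
	}
}
```
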

Fields of each entry in `strategies`:

| Field | Type | Description |
|---|---|---|
| `priority` | `bool` | Indicates if this is a high-priority strategy |
| `execution_time` | `uint64` | Desired execution time in nanoseconds |
| `pid` | `uint32` | Optional specific PID to apply this strategy |
| `selectors` | `[]LabelSelector` | Optional selectors to match pods for this strategy |


Fields of each entry in `selectors` (`LabelSelector`):

| Field | Type | Description |
|---|---|---|
| `key` | `string` | Label key to match |
| `value` | `string` | Expected value of the label |

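
The `POST` body carries label selectors, while the `GET` response lists one entry per PID, so selectors have to be expanded into PIDs at some point. The sketch below shows one way that could work. It is not the server's implementation: `SchedulingStrategy`, `labeledPod`, `resolvedStrategy`, `matches`, and `expand` are hypothetical names, and it assumes that all selectors of a strategy must match (AND semantics), that an empty selector list matches no pod, and that a `pid` of 0 means "not set".

```go
package main

import "fmt"

// LabelSelector matches the selector table.
type LabelSelector struct {
	Key   string `json:"key"`
	Value string `json:"value"`
}

// SchedulingStrategy is a hypothetical name for one "strategies" entry.
type SchedulingStrategy struct {
	Priority      bool            `json:"priority"`
	ExecutionTime uint64          `json:"execution_time"`
	PID           uint32          `json:"pid,omitempty"`
	Selectors     []LabelSelector `json:"selectors,omitempty"`
}

// labeledPod is a hypothetical view of a pod: its labels and its PIDs.
type labeledPod struct {
	Labels map[string]string
	PIDs   []uint32
}

// resolvedStrategy is a hypothetical per-PID result.
type resolvedStrategy struct {
	Priority      bool
	ExecutionTime uint64
	PID           uint32
}

// matches reports whether every selector is satisfied by labels.
// Assumption: selectors are ANDed and an empty list matches nothing.
func matches(selectors []LabelSelector, labels map[string]string) bool {
	if len(selectors) == 0 {
		return false
	}
	for _, s := range selectors {
		if v, ok := labels[s.Key]; !ok || v != s.Value {
			return false
		}
	}
	return true
}

// expand turns strategies into per-PID entries: an explicit PID is used
// as-is, and each pod matched by the selectors contributes all its PIDs.
func expand(strategies []SchedulingStrategy, pods []labeledPod) []resolvedStrategy {
	var out []resolvedStrategy
	for _, st := range strategies {
		if st.PID != 0 { // assumption: 0 means the pid field was omitted
			out = append(out, resolvedStrategy{Priority: st.Priority, ExecutionTime: st.ExecutionTime, PID: st.PID})
		}
		for _, p := range pods {
			if !matches(st.Selectors, p.Labels) {
				continue
			}
			for _, pid := range p.PIDs {
				out = append(out, resolvedStrategy{Priority: st.Priority, ExecutionTime: st.ExecutionTime, PID: pid})
			}
		}
	}
	return out
}

func main() {
	strategies := []SchedulingStrategy{
		{Priority: true, ExecutionTime: 20000000, Selectors: []LabelSelector{{Key: "nf", Value: "upf"}}},
		{Priority: false, ExecutionTime: 10000000, PID: 717720},
	}
	pods := []labeledPod{{Labels: map[string]string{"nf": "upf"}, PIDs: []uint32{718001}}}
	for _, r := range expand(strategies, pods) {
		fmt.Printf("%+v\n", r)
	}
}
```
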

Possible extensions:

- Data Persistence: Store received metrics in a database (such as PostgreSQL or MongoDB)
- Data Analytics: Add statistical and analytical features
- Alert System: Set up alert rules based on metric values
- Authentication & Authorization: Add API key or JWT authentication
- Batch Processing: Support submitting multiple metrics records in one request
- Monitoring Dashboard: Build a web interface to visualize metrics data
- Uses Gorilla Mux for routing
- Includes middleware for CORS and logging
- Structured error handling and response format
- Timestamps use RFC3339 format
- Kubernetes client integration with caching
- Support for both in-cluster and out-of-cluster operation
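
This API server can integrate with Kubernetes to obtain real pod label information. It supports two modes of operation:

When deployed inside a Kubernetes cluster, the API server automatically uses its ServiceAccount to connect to the Kubernetes API. The complete deployment manifest is in k8s/deployment.yaml and includes:
- A ServiceAccount with the required permissions
- A Deployment with health checks
- A Service that exposes the API

To deploy:

```bash
kubectl apply -f k8s/deployment.yaml
```

When running outside a Kubernetes cluster, the API server tries to connect to the Kubernetes API using a kubeconfig file. By default it uses ~/.kube/config. You can specify a different path by setting the KUBECONFIG environment variable or by passing the --kubeconfig flag:

```bash
export KUBECONFIG=/path/to/your/kubeconfig
go run main.go
# or
go run main.go --kubeconfig=/path/to/your/kubeconfig
```

The API server periodically refreshes its pod label cache so that scheduling strategies are applied correctly even when pod labels change.

If the Kubernetes API cannot be reached, the system automatically falls back to mock data. This is useful in development and test environments.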
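
To build the Docker image:

```bash
docker build -t bss-metrics-api:latest .
```

This project uses GitHub Actions to automatically build and publish container images to GitHub Container Registry (ghcr.io).

Automatic publishing is triggered:
- On pushes to the `main` branch
- When a new version tag is created (for example, `v1.0.0`)

Image tagging rules:
- Pushes to the `main` branch are tagged `main` and `main-<sha>`
- Version tags are tagged `v1.0.0`, `1.0`, `1`, and so on

Using the published images:

```bash
# Use the latest main build
docker pull ghcr.io/gthulhu/api:main
# Use a specific version
docker pull ghcr.io/gthulhu/api:v1.0.0
```

This project is open source.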
