Add system info and health API #113
Open: heavy-d wants to merge 12 commits into main from add-system-api
Commits (12, all by heavy-d):
2bd291e Add system API endpoints for system information and health checks
d1c534c Refactor system API by consolidating endpoints and models
55bf155 Enhance system API with additional version and path information
021fe83 Enhance CUDA version retrieval in get_versions_info function
d3b8d3c Enhance health check functionality and directory management
4542918 Add main log file path to get_paths_info function
22dfc81 Enhance system API and improve user experience
70be7b1 Enhance health check and system stats functionality
d7a3e60 Refactor get_paths_info function to use platform-specific template paths
6571fa8 Update caching mechanism for system stats and adjust TTL
9315525 Add GPU information retrieval and enhance health check path handling
14a23e7 Merge branch 'main' into add-system-api
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -21,7 +21,7 @@ | |
| from nodetool.metadata.types import Provider | ||
| from nodetool.packages.registry import get_nodetool_package_source_folders | ||
|
|
||
| - from . import asset, job, message, node, storage, workflow, model, settings, thread | ||
| + from . import asset, job, message, node, storage, workflow, model, settings, thread, system | ||
| import mimetypes | ||
|
|
||
| from nodetool.common.websocket_updates import websocket_updates | ||
|
|
@@ -103,6 +103,8 @@ def get_routers(cls) -> List[APIRouter]: | |
| DEFAULT_ROUTERS.append(file.router) | ||
| DEFAULT_ROUTERS.append(settings.router) | ||
| DEFAULT_ROUTERS.append(collection.router) | ||
| + # System endpoints are only available in non-production | ||
| + DEFAULT_ROUTERS.append(system.router) | ||
| DEFAULT_ROUTERS.append(package.router) | ||
|
|
||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,162 @@ | ||
| from typing import List, Literal, Optional, Any, Dict | ||
| from fastapi import APIRouter | ||
| from pydantic import BaseModel | ||
|
|
||
| from nodetool.common.system_stats import get_system_stats, SystemStats | ||
|
|
||
| from .info import get_os_info, get_versions_info, get_paths_info | ||
| from .health import run_health_checks | ||
|
|
||
|
|
||
| router = APIRouter(prefix="/api/system", tags=["system"]) | ||
|
|
||
|
|
||
| class OSInfo(BaseModel): | ||
| platform: str | ||
| release: str | ||
| arch: str | ||
|
|
||
|
|
||
| class VersionsInfo(BaseModel): | ||
| python: Optional[str] = None | ||
| nodetool_core: Optional[str] = None | ||
| nodetool_base: Optional[str] = None | ||
| cuda: Optional[str] = None | ||
| gpu_name: Optional[str] = None | ||
| vram_total_gb: Optional[str] = None | ||
| driver_version: Optional[str] = None | ||
|
|
||
|
|
||
| class PathsInfo(BaseModel): | ||
| settings_path: str | ||
| secrets_path: str | ||
| data_dir: str | ||
| core_logs_dir: str | ||
| core_log_file: str | ||
| ollama_models_dir: str | ||
| huggingface_cache_dir: str | ||
| electron_user_data: str | ||
| electron_log_file: str | ||
| electron_logs_dir: str | ||
| electron_main_log_file: str | ||
|
|
||
|
|
||
| class SystemInfoResponse(BaseModel): | ||
| os: OSInfo | ||
| versions: VersionsInfo | ||
| paths: PathsInfo | ||
|
|
||
|
|
||
| class HealthCheck(BaseModel): | ||
| id: str | ||
| status: Literal["ok", "warn", "error"] | ||
| details: Optional[str] = None | ||
| fix_hint: Optional[str] = None | ||
|
|
||
|
|
||
| class HealthSummary(BaseModel): | ||
| ok: int | ||
| warn: int | ||
| error: int | ||
|
|
||
|
|
||
| class HealthResponse(BaseModel): | ||
| checks: List[HealthCheck] | ||
| summary: HealthSummary | ||
|
|
||
|
|
||
| _CACHE: dict[str, tuple[float, dict]] = {} | ||
| _TTL_SECONDS = 30.0 | ||
|
|
||
|
|
||
| @router.get("/") | ||
| async def get_system_info() -> SystemInfoResponse: | ||
| import time | ||
|
|
||
| now = time.time() | ||
| cached = _CACHE.get("system_info") | ||
| if cached and (now - cached[0]) < _TTL_SECONDS: | ||
| payload = cached[1] | ||
| else: | ||
| os_info = get_os_info() | ||
| versions = get_versions_info() | ||
| paths = get_paths_info() | ||
| payload = { | ||
| "os": os_info, | ||
| "versions": versions, | ||
| "paths": paths, | ||
| } | ||
| _CACHE["system_info"] = (now, payload) | ||
|
|
||
| return SystemInfoResponse( | ||
| os=OSInfo(**payload["os"]), | ||
| versions=VersionsInfo(**payload["versions"]), | ||
| paths=PathsInfo(**payload["paths"]), | ||
| ) | ||
|
|
||
|
|
||
| @router.get("/health") | ||
| async def get_system_health() -> HealthResponse: | ||
| try: | ||
| result: Dict[str, Any] = run_health_checks() | ||
|
|
||
| # Validate the structure of the result | ||
| if not isinstance(result, dict): | ||
| raise ValueError("Health check result must be a dictionary") | ||
|
|
||
| checks_list: List[Dict[str, Any]] = result.get("checks", []) or [] | ||
| if not isinstance(checks_list, list): | ||
| raise ValueError("Health check 'checks' must be a list") | ||
|
|
||
| # Validate each check has required fields | ||
| validated_checks = [] | ||
| for check in checks_list: | ||
| if not isinstance(check, dict): | ||
| continue # Skip invalid checks | ||
| if "id" not in check or "status" not in check: | ||
| continue # Skip checks missing required fields | ||
| validated_checks.append(check) | ||
|
|
||
| checks_models = [HealthCheck(**c) for c in validated_checks] | ||
|
|
||
| summary_data = result.get("summary", {}) or {} | ||
| if not isinstance(summary_data, dict): | ||
| summary_data = {"ok": 0, "warn": 0, "error": 0} | ||
|
|
||
| # Ensure summary has required fields with defaults | ||
| summary_data.setdefault("ok", 0) | ||
| summary_data.setdefault("warn", 0) | ||
| summary_data.setdefault("error", 0) | ||
|
|
||
| summary = HealthSummary(**summary_data) | ||
| return HealthResponse(checks=checks_models, summary=summary) | ||
|
|
||
| except Exception as e: | ||
| # Return a safe fallback response if health checks fail | ||
| return HealthResponse( | ||
| checks=[HealthCheck( | ||
| id="health_check_error", | ||
| status="error", | ||
| details=f"Health check system error: {str(e)}", | ||
| fix_hint="Check system logs for more details" | ||
| )], | ||
| summary=HealthSummary(ok=0, warn=0, error=1) | ||
| ) | ||
|
|
||
|
|
||
| @router.get("/stats") | ||
| async def get_stats() -> SystemStats: | ||
| import time | ||
|
|
||
| now = time.time() | ||
| cached = _CACHE.get("system_stats") | ||
| if cached and (now - cached[0]) < _TTL_SECONDS: | ||
| return SystemStats(**cached[1]) | ||
| else: | ||
| stats = get_system_stats() | ||
| stats_dict = stats.model_dump() | ||
| _CACHE["system_stats"] = (now, stats_dict) | ||
| return stats | ||
|
|
||
|
|
||
|
|
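The `/` and `/stats` handlers in the diff above repeat the same timestamp-plus-payload check against `_CACHE`. As a review note, that pattern could be factored into a single helper. The following is a minimal sketch, not code from this PR: `cached_call` and its `clock` parameter are hypothetical names, and `time.monotonic` is an assumed substitute for the `time.time()` used in the diff (a monotonic clock is unaffected by wall-clock adjustments).

```python
import time
from typing import Any, Callable, Dict, Tuple

# Same shape as the module-level cache in the diff: key -> (timestamp, value).
_CACHE: Dict[str, Tuple[float, Any]] = {}
_TTL_SECONDS = 30.0


def cached_call(
    key: str,
    compute: Callable[[], Any],
    ttl: float = _TTL_SECONDS,
    clock: Callable[[], float] = time.monotonic,  # assumption: swapped in for time.time()
) -> Any:
    """Return the cached value for `key` if it is younger than `ttl` seconds,
    otherwise call `compute()`, store the result, and return it."""
    now = clock()
    entry = _CACHE.get(key)
    if entry is not None and (now - entry[0]) < ttl:
        return entry[1]
    value = compute()
    _CACHE[key] = (now, value)
    return value
```

With a helper like this, a handler body would reduce to something like `cached_call("system_stats", get_system_stats)`. The injectable `clock` makes expiry testable without sleeping.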
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
Copilot Autofix
AI 4 months ago
To fully mitigate the risk of path traversal and ensure that only paths within the allowed directories can be opened, we should:
- Ensure the `safe_roots` are resolved to absolute paths before comparison.
- Note that `is_relative_to` is not available before Python 3.9; since the code already uses it, we assume Python 3.9+.
- Apply the change in the `/open_in_explorer` endpoint, specifically in the block where `safe_roots` are used and compared.

No new imports are needed, but we should add a line to resolve each root directory before the comparison loop.
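The endpoint code that this autofix targets is not shown on this page, so the following is only a sketch of the check it describes: resolve both the requested path and each allowed root, then compare with `Path.is_relative_to` (Python 3.9+). The function name `is_within_safe_roots` and its signature are hypothetical, not taken from the PR.

```python
from pathlib import Path
from typing import Iterable


def is_within_safe_roots(candidate: str, safe_roots: Iterable[str]) -> bool:
    """Return True only if `candidate` resolves to a location inside one of
    the allowed root directories (hypothetical helper, Python 3.9+)."""
    # Resolve the requested path first so ".." segments and symlinks are
    # collapsed before any comparison takes place.
    target = Path(candidate).resolve()
    for root in safe_roots:
        # Resolve each root as well, so both sides of the comparison are
        # absolute, normalized paths.
        resolved_root = Path(root).resolve()
        if target.is_relative_to(resolved_root):
            return True
    return False
```

Resolving only one side would let a relative or symlinked root compare unequal to an otherwise valid target, which is why the fix resolves each root before the comparison loop.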