
Conversation


@zhravan zhravan commented Oct 16, 2025

Issue

Link to related issue(s):


Description

This PR introduces a REST API integration for LXD container lifecycle management, adding HTTP endpoints to create, list, retrieve, start, stop, restart, and delete LXD container instances.

Key additions:

  • Complete LXD container lifecycle management via REST API
  • Configuration options for LXD integration (socket path, project selection, operation timeout)
  • Go toolchain upgrade to version 1.25.3
  • Updated dependencies including gRPC, cryptography, and networking libraries

Scope of Change

Select all applicable areas impacted by this PR:

  • API
  • Infra / Deployment
  • View (UI/UX)
  • CLI
  • Docs
  • Other (specify): ________

Screenshot / Video / GIF (if applicable)

Attach or embed screenshots, screen recordings, or GIFs demonstrating the feature or fix.


Related PRs (if any)

Link any related or dependent PRs across repos.


Additional Notes for Reviewers (optional)

Anything reviewers should know before testing or merging (e.g., environment variables, setup steps).


Developer Checklist

To be completed by the developer who raised the PR.

  • Add valid/relevant title for the PR
  • Self-review done
  • Manual dev testing done
  • No secrets exposed
  • No merge conflicts
  • Docs added/updated (if applicable)
  • Removed debug prints / secrets / sensitive data
  • Unit / Integration tests passing
  • Follows all standards defined in Nixopus Docs

Reviewer Checklist

To be completed by the reviewer before merge.

  • Peer review done
  • No console.logs / fmt.prints left
  • No secrets exposed
  • If any DB migrations, migration changes are verified
  • Verified release changes are production-ready

Summary by CodeRabbit

  • New Features

    • LXD container lifecycle management via HTTP endpoints (create, list, get, start, stop, restart, delete, bulk delete).
    • Runtime-configurable LXD settings (enable, socket path, project, operation timeout).
    • Installer CLI flag to optionally include LXD; installer can perform best-effort LXD setup on Linux.
  • Chores

    • Go toolchain and dependency updates for stability and compatibility.
    • Compose and env defaults updated to expose/configure the LXD socket.


coderabbitai bot commented Oct 16, 2025

Walkthrough

Adds LXD support: configuration bindings, a ClientService that talks to the local LXD daemon via unix socket, an HTTP controller with endpoints, route wiring guarded by runtime config, compose mounts for the LXD socket, CLI install option, and multiple dependency bumps.
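The unix-socket transport mentioned in the walkthrough can be sketched with only the standard library. This is an illustrative stand-in, not the canonical LXD client the PR actually uses, and the socket path shown is just the snap default from the compose mounts.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/http"
)

// newUnixHTTPClient returns an *http.Client whose connections dial a
// unix socket instead of TCP; the hostname in request URLs is then
// effectively ignored by the dialer.
func newUnixHTTPClient(socketPath string) *http.Client {
	return &http.Client{
		Transport: &http.Transport{
			DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
				return (&net.Dialer{}).DialContext(ctx, "unix", socketPath)
			},
		},
	}
}

func main() {
	// Snap default path; no request is sent here, so no daemon is needed.
	c := newUnixHTTPClient("/var/snap/lxd/common/lxd/unix.socket")
	fmt.Println(c.Transport != nil)
}
```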

Changes

Cohort / File(s) Summary
Configuration & Types
api/internal/types/types.go, api/internal/config/config.go
Add LXDConfig and LXD field on Config; bind environment variables (lxd.enabled, lxd.socket_path, lxd.project, lxd.operation_timeout_seconds) and validate/log defaults for socket path and operation timeout.
LXD Service Layer
api/internal/features/lxd/service/service.go
New Service interface and ClientService communicating with LXD via unix socket; implements Create/List/Get/Start/Stop/Restart/Delete/DeleteAll, per-operation timeouts, waitOp polling, device mapping, and project scoping; defaults socket path and timeout when omitted.
LXD HTTP Controller
api/internal/features/lxd/controller/controller.go
New Controller with endpoints wrapping service methods, per-request context timeouts, request body parsing, standardized JSON responses, and HTTP error mapping (400/404/500).
Routing & Integration
api/internal/routes.go
Add LXDRoutes and wire routes into Router.Routes() when config LXD enabled; instantiate service/controller lazily and apply RBAC/Audit middleware.
Docker Compose
docker-compose.yml, docker-compose-staging.yml
Expose host LXD unix socket via volume mount /var/snap/lxd/common/lxd/unix.socket into API services.
Dev / Prod Config
helpers/config.dev.yaml, helpers/config.prod.yaml
Add environment variables: LXD_ENABLED, LXD_SOCKET_PATH, LXD_PROJECT, LXD_OPERATION_TIMEOUT_SECONDS with defaults; dev file contains TODO comment for installer.
CLI Installer
cli/app/commands/install/command.py, cli/app/commands/install/run.py
New include_lxd CLI option, passed to Install; Install implements optional LXD installation step (_install_lxd_if_enabled), updates env vars when enabled, supports dry-run and logs outcomes.
Dependencies / Tooling
api/go.mod
Bump Go toolchain (1.25→1.25.3); add github.com/canonical/lxd and other direct/indirect dependency version upgrades across opentelemetry, grpc, protobuf, jwt, golang.org/x/*, and related modules.
Packaging
cli/pyproject.toml
CLI version bumped 0.1.11 → 0.1.12.
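Based on the summaries above, the service surface can be approximated by the interface below. `Instance` stands in for the real `lxdapi.Instance` type, and the exact signatures in `service.go` may differ; this is a sketch of the shape, not the shipped code.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Instance is an illustrative stand-in for lxdapi.Instance.
type Instance struct {
	Name   string
	Status string
}

// Service approximates the operations attributed to the new
// api/internal/features/lxd/service package in the change summary.
type Service interface {
	Create(ctx context.Context, name, imageAlias string, profiles []string,
		config map[string]string, devices map[string]map[string]string) (*Instance, error)
	List(ctx context.Context) ([]Instance, error)
	Get(ctx context.Context, name string) (*Instance, error)
	Start(ctx context.Context, name string) error
	Stop(ctx context.Context, name string, force bool) error
	Restart(ctx context.Context, name string, timeout time.Duration) error
	Delete(ctx context.Context, name string) error
	DeleteAll(ctx context.Context) error
}

func main() {
	fmt.Println("service interface sketch compiles")
}
```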

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Controller
    participant Service
    participant LXD as "LXD Daemon (unix socket)"
    
    Client->>Controller: HTTP request (e.g., POST /lxd)
    activate Controller
    Controller->>Controller: Parse body, validate
    Controller->>Controller: context.WithTimeout (per-action)
    Controller->>Service: Call method (ctx, params)
    deactivate Controller

    activate Service
    Service->>LXD: LXD API call over unix socket (Create / Action)
    activate LXD
    LXD-->>Service: Operation ID / initial response
    deactivate LXD

    Service->>Service: waitOp loop (poll operation)
    loop poll until complete or ctx timeout
      Service->>LXD: Get operation status
      LXD-->>Service: Status update
    end
    Service-->>Controller: result or error
    deactivate Service

    activate Controller
    Controller->>Client: JSON response (Status, Data/Message) with HTTP code
    deactivate Controller

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 In burrows bright I softly hop,
LXD sockets bring containers to hop,
Services hum and routes align,
Timeouts tick, operations shine,
A rabbit cheers: "Containers online!" 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 14.29%, below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve coverage.
✅ Passed checks (2 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title concisely and clearly summarizes the primary change, adding LXD container lifecycle management via a REST API integration, without extraneous details or noise.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment


Comment @coderabbitai help to get the list of available commands and usage tips.


zhravan commented Oct 16, 2025

@coderabbitai: Add summary and title please

coderabbitai bot changed the title from "feat: lxd intg as rest" to "feat: Add LXD Container Management REST API Integration" on Oct 16, 2025

coderabbitai bot commented Oct 16, 2025

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

@zhravan, I'll help you add a better title and summary to the PR.


🏁 Script executed:

#!/bin/bash

# Craft a better PR title and description
PR_TITLE="feat: Add LXD Container Management REST API Integration"

PR_BODY="#### Issue
_Link to related issue(s):_  

---

#### Description
This PR introduces REST API integration for LXD container lifecycle management. The implementation provides HTTP endpoints for comprehensive container operations including creation, listing, retrieval, starting, stopping, restarting, and deletion of LXD container instances.

Key additions:
- Complete LXD container lifecycle management via REST API
- Configuration options for LXD integration (socket path, project selection, operation timeout)
- Go toolchain upgrade to version 1.25.3
- Updated dependencies including gRPC, cryptography, and networking libraries

---

#### Scope of Change
_Select all applicable areas impacted by this PR:_

- [x] API
- [x] Infra / Deployment
- [ ] View (UI/UX)
- [ ] CLI
- [ ] Docs
- [ ] Other (specify): ________

---

#### Screenshot / Video / GIF (if applicable)
_Attach or embed screenshots, screen recordings, or GIFs demonstrating the feature or fix._

---

#### Related PRs (if any)
_Link any related or dependent PRs across repos._

---

#### Additional Notes for Reviewers (optional)
_Anything reviewers should know before testing or merging (e.g., environment variables, setup steps)._

---

#### Developer Checklist
_To be completed by the developer who raised the PR._

- [x] Add valid/relevant title for the PR
- [ ] Self-review done  
- [ ] Manual dev testing done  
- [ ] No secrets exposed  
- [ ] No merge conflicts  
- [ ] Docs added/updated (if applicable)  
- [ ] Removed debug prints / secrets / sensitive data  
- [ ] Unit / Integration tests passing  
- [ ] Follows all standards defined in **Nixopus Docs**

---

#### Reviewer Checklist
_To be completed by the reviewer before merge._

- [ ] Peer review done  
- [ ] No console.logs / fmt.prints left  
- [ ] No secrets exposed  
- [ ] If any DB migrations, migration changes are verified
- [ ] Verified release changes are production-ready



<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

* **New Features**
  * Added LXD container lifecycle management with new HTTP endpoints for creating, listing, retrieving, starting, stopping, restarting, and deleting container instances.
  * Added LXD integration configuration options including socket path, project selection, and operation timeout customization.

* **Chores**
  * Updated Go toolchain to version 1.25.3.
  * Updated multiple dependencies including gRPC, cryptography, and networking libraries for stability.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->"

# Update the PR title and body
gh pr edit 511 --title "$PR_TITLE" --body "$PR_BODY"

Length of output: 46


@zhravan, I've successfully updated the PR!

New Title:
feat: Add LXD Container Management REST API Integration

Summary Added:
The description now includes:

  • Comprehensive overview of the LXD container lifecycle management REST API
  • Key additions listing all major features and updates
  • Filled in the "Scope of Change" section (API and Infra/Deployment checked)
  • Marked the "Add valid/relevant title for the PR" checkbox as complete

The PR is now ready with a clear, professional title and detailed description that accurately reflects the changes introduced.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

🧹 Nitpick comments (15)
helpers/config.prod.yaml (1)

35-39: LXD env surface looks good; add code‑level defaults and note opt‑in socket mount.

Bindings match viper keys. To harden boot without helper configs, set viper defaults in code and document that enabling requires the socket bind.

api/internal/config/config.go (2)

195-199: Set Viper defaults for LXD to decouple from external YAML/env.

Provide sane defaults in code so the app boots predictably even without helper configs.

 	// LXD
+	viper.SetDefault("lxd.enabled", false)
+	viper.SetDefault("lxd.socket_path", "/var/snap/lxd/common/lxd/unix.socket")
+	viper.SetDefault("lxd.project", "default")
+	viper.SetDefault("lxd.operation_timeout_seconds", 60)
 	viper.BindEnv("lxd.enabled", "LXD_ENABLED")
 	viper.BindEnv("lxd.socket_path", "LXD_SOCKET_PATH")
 	viper.BindEnv("lxd.project", "LXD_PROJECT")
 	viper.BindEnv("lxd.operation_timeout_seconds", "LXD_OPERATION_TIMEOUT_SECONDS")

260-268: Add a socket existence check when LXD is enabled.

Warn early if the configured socket path is missing to aid troubleshooting.

 	if config.LXD.Enabled {
-		if config.LXD.SocketPath == "" {
+		if config.LXD.SocketPath == "" {
 			log.Printf("Warning: LXD is enabled but socket_path is not set; default will be used if available")
-		}
+		} else if _, err := os.Stat(config.LXD.SocketPath); err != nil {
+			log.Printf("Warning: LXD socket not found at '%s': %v", config.LXD.SocketPath, err)
+		}
 		if config.LXD.OperationTimeoutSeconds <= 0 {
 			log.Printf("Info: LXD operation_timeout_seconds not set; will use default 60 seconds")
 		}
 	}
api/internal/routes.go (1)

281-297: Conditional LXD wiring is fine; consider feature‑flag parity.

Other features use FeatureFlagMiddleware; if you need runtime toggling (beyond config), add a “lxd” feature flag for consistency.

Please confirm RBAC permissions include “lxd:*” actions so access is properly gated.

api/go.mod (3)

214-219: OpenTelemetry versions misaligned; align components to the same minor.

Core modules are at 1.38.0, but exporter otlptracehttp is at 1.35.0 (Line 217). Version skew can cause subtle runtime issues.

Apply:

- go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.35.0
+ go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.38.0

Then run go mod tidy and a quick smoke test.


7-7: Document or pin LXD dependency
Commit 3bb7626c5f70 exists but isn’t part of any release tag (latest is lxd-6.5 at fec3cc8…). If you need this commit’s changes, add a comment explaining why you’re using a pseudo-version; otherwise, consider pinning to the nearest stable tag (e.g., lxd-6.5).


105-105: Consolidate jwt imports to v5
Direct imports of v3 found in:

  • api/internal/features/github-connector/service/installation_token.go
  • api/internal/features/auth/utils/create_token.go
  • api/internal/features/auth/service/reset_password.go

Migrate these to github.com/golang-jwt/jwt/v5 and remove the v3 dependency from go.mod.
api/internal/features/lxd/service/service.go (8)

105-112: Start: add input and early ctx checks.

Preempt invalid calls and avoid kicking off remote ops when ctx is already done.

 func (s *ClientService) Start(ctx context.Context, name string) error {
+	if name == "" {
+		return errors.New("name is required")
+	}
+	if err := ctx.Err(); err != nil {
+		return err
+	}
 	req := lxdapi.InstanceStatePut{Action: "start", Timeout: int(s.timeout.Seconds()), Force: false, Stateful: false}
 	op, err := s.client.UpdateInstanceState(name, req, "")
 	if err != nil {
 		return err
 	}
 	return waitOp(ctx, op, s.timeout)
 }

114-121: Stop: add input and early ctx checks.

Same rationale as Start.

 func (s *ClientService) Stop(ctx context.Context, name string, force bool) error {
+	if name == "" {
+		return errors.New("name is required")
+	}
+	if err := ctx.Err(); err != nil {
+		return err
+	}
 	req := lxdapi.InstanceStatePut{Action: "stop", Timeout: int(s.timeout.Seconds()), Force: force}
 	op, err := s.client.UpdateInstanceState(name, req, "")
 	if err != nil {
 		return err
 	}
 	return waitOp(ctx, op, s.timeout)
 }

123-134: Restart: add input and early ctx checks.

Avoid issuing restart when ctx is already done; validate name.

 func (s *ClientService) Restart(ctx context.Context, name string, timeout time.Duration) error {
+	if name == "" {
+		return errors.New("name is required")
+	}
+	if err := ctx.Err(); err != nil {
+		return err
+	}
 	to := int(s.timeout.Seconds())
 	if timeout > 0 {
 		to = int(timeout.Seconds())
 	}
 	req := lxdapi.InstanceStatePut{Action: "restart", Timeout: to, Force: true}
 	op, err := s.client.UpdateInstanceState(name, req, "")
 	if err != nil {
 		return err
 	}
 	return waitOp(ctx, op, time.Duration(to)*time.Second)
 }

97-103: Get: validate name and early ctx check.

Prevents unnecessary client calls and clearer errors.

 func (s *ClientService) Get(ctx context.Context, name string) (*lxdapi.Instance, error) {
+	if name == "" {
+		return nil, errors.New("name is required")
+	}
+	if err := ctx.Err(); err != nil {
+		return nil, err
+	}
 	inst, _, err := s.client.GetInstance(name)
 	if err != nil {
 		return nil, err
 	}
 	return inst, nil
 }

89-95: List: early ctx check.

Return fast if the request is already canceled.

 func (s *ClientService) List(ctx context.Context) ([]lxdapi.Instance, error) {
+	if err := ctx.Err(); err != nil {
+		return nil, err
+	}
 	instances, err := s.client.GetInstances(lxdapi.InstanceTypeAny)
 	if err != nil {
 		return nil, err
 	}
 	return instances, nil
 }

54-87: Create: early ctx check (and confirm profiles default behavior).

Avoid starting long ops if ctx is done. Also confirm whether empty profiles should default to "default" profile.

 func (s *ClientService) Create(ctx context.Context, name string, imageAlias string, profiles []string, config map[string]string, devices map[string]map[string]string) (*lxdapi.Instance, error) {
 	if name == "" {
 		return nil, errors.New("name is required")
 	}
 	if imageAlias == "" {
 		return nil, errors.New("image alias is required")
 	}
+	if err := ctx.Err(); err != nil {
+		return nil, err
+	}
 
 	req := lxdapi.InstancesPost{

Would you like Create to auto-apply the "default" profile when profiles is nil/empty?


146-161: DeleteAll: honor ctx during listing and per-item deletion.

Short-circuit quickly on cancellation, and re-check within the loop.

 func (s *ClientService) DeleteAll(ctx context.Context) error {
+	if err := ctx.Err(); err != nil {
+		return err
+	}
 	instances, err := s.client.GetInstances(lxdapi.InstanceTypeAny)
 	if err != nil {
 		return err
 	}
 	var errs []string
 	for _, inst := range instances {
+		if err := ctx.Err(); err != nil {
+			return err
+		}
 		if err := s.Delete(ctx, inst.Name); err != nil {
 			errs = append(errs, fmt.Sprintf("%s: %v", inst.Name, err))
 		}
 	}
 	if len(errs) > 0 {
 		return fmt.Errorf("failed to delete some instances: %s", strings.Join(errs, ", "))
 	}
 	return nil
 }

33-52: Socket default: add non-snap fallback or verify environment.

The default points to the snap path; on non-snap installs it’s commonly /var/lib/lxd/unix.socket. Consider a simple fallback.

 func New(socketPath string, project string, opTimeoutSec int) (*ClientService, error) {
 	if socketPath == "" {
-		// common default for snap
-		socketPath = "/var/snap/lxd/common/lxd/unix.socket"
+		// prefer snap path; fallback to non-snap
+		if _, statErr := os.Stat("/var/snap/lxd/common/lxd/unix.socket"); statErr == nil {
+			socketPath = "/var/snap/lxd/common/lxd/unix.socket"
+		} else {
+			socketPath = "/var/lib/lxd/unix.socket"
+		}
 	}

Add import:

import "os"
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between de27f30 and 15f6796.

📒 Files selected for processing (10)
  • api/go.mod (8 hunks)
  • api/internal/config/config.go (2 hunks)
  • api/internal/features/lxd/controller/controller.go (1 hunks)
  • api/internal/features/lxd/service/service.go (1 hunks)
  • api/internal/routes.go (3 hunks)
  • api/internal/types/types.go (2 hunks)
  • docker-compose-staging.yml (1 hunks)
  • docker-compose.yml (1 hunks)
  • helpers/config.dev.yaml (1 hunks)
  • helpers/config.prod.yaml (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
api/internal/types/types.go (2)
api/internal/storage/store.go (1)
  • App (19-23)
api/internal/config/config.go (1)
  • AppConfig (18-18)
api/internal/routes.go (5)
api/internal/features/lxd/service/service.go (1)
  • ClientService (26-30)
api/internal/config/config.go (1)
  • AppConfig (18-18)
api/internal/features/lxd/controller/controller.go (2)
  • Controller (13-15)
  • NewController (17-19)
api/internal/middleware/rbac.go (1)
  • RBACMiddleware (20-125)
api/internal/middleware/audit.go (1)
  • AuditMiddleware (20-106)
api/internal/features/lxd/controller/controller.go (2)
api/internal/features/lxd/service/service.go (1)
  • ClientService (26-30)
api/internal/types/response.go (1)
  • Response (5-20)
🔇 Additional comments (6)
api/internal/routes.go (2)

43-45: LXD imports look good and isolated.

Scoped under dedicated controller/service packages; aligns with optional wiring.


47-53: Late binding helpers are a clean way to avoid unused imports.

Keeps construction localized and testable.

api/go.mod (4)

35-36: x/crypto v0.43.0 & x/net v0.46.0: no OSV advisories found. Both upgrades have no known vulnerabilities.


229-236: Compatibility verified with Go 1.25
Running go mod tidy -compat=1.25 in api/ completed without errors.


239-244: Proto toolchain coordination not applicable here.

This codebase has no .proto source files, no generated .pb.go files, and no direct usage of gRPC/protobuf packages in the code. The gRPC, protobuf, and genproto dependencies in go.mod are transitive (pulled in by other indirect dependencies). The review comment's concern about coordinating protoc, protoc-gen-go, and protoc-gen-go-grpc versions applies only when actively generating proto code—a practice not present in this project. No action is required on the dependency versions shown.

Likely an incorrect or invalid review comment.


21-21: Confirm SFTP v1.13.9 compatibility
All used methods (NewSftp, Close, Create, ReadDir, Open, Mkdir/All, Remove, Rename, Stat) retain identical signatures and behavior—no breakage detected.

module github.com/raghavyuva/nixopus-api

-go 1.25
+go 1.25.3

⚠️ Potential issue | 🔴 Critical

Fix go.mod directive: patch version not allowed; use toolchain for patch pin.

In go.mod, go must be major.minor (e.g., 1.25). To pin the patch toolchain, add a toolchain directive.

Apply:

-go 1.25.3
+go 1.25
+toolchain go1.25.3
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-go 1.25.3
+go 1.25
+toolchain go1.25.3
🤖 Prompt for AI Agents
In api/go.mod around line 3, the go directive incorrectly uses a patch version
("go 1.25.3"); change it to the major.minor form ("go 1.25") and add a toolchain
directive to pin the patch toolchain (e.g., add a top-level `toolchain` block
specifying the full version to pin the patch level). Ensure the go directive is
updated to `1.25` and add the toolchain stanza after the module/require blocks
as appropriate so the repository uses the intended patch toolchain.

Comment on lines +29 to +41
func (c *Controller) Create(ctx fuego.ContextWithBody[createRequest]) (*shared_types.Response, error) {
	body, err := ctx.Body()
	if err != nil {
		return nil, fuego.HTTPError{Err: err, Status: http.StatusBadRequest}
	}
	reqCtx, cancel := context.WithTimeout(ctx.Request().Context(), 120*time.Second)
	defer cancel()
	inst, err := c.svc.Create(reqCtx, body.Name, body.Image, body.Profiles, body.Config, body.Devices)
	if err != nil {
		return nil, fuego.HTTPError{Err: err, Status: http.StatusInternalServerError}
	}
	return &shared_types.Response{Status: "success", Message: "created", Data: inst}, nil
}

⚠️ Potential issue | 🟠 Major

Return 4xx for request-validation failures.

service.Create returns plain errors (e.g., “name is required”, “image alias is required”) for client input issues. As written, those bubble up here and we wrap them in a 500, incorrectly signaling a server fault. Please detect these validation errors and translate them to a 400 (or another appropriate 4xx) so clients receive the right signal.

🤖 Prompt for AI Agents
In api/internal/features/lxd/controller/controller.go around lines 29 to 41,
service.Create can return validation errors for bad client input which are
currently mapped to 500; change the handler to detect and translate those to 4xx
(400) responses. Implement detection by checking the returned error's type (use
errors.As or errors.Is against a ValidationError sentinel/type defined in the
service layer, or have the service return a custom ValidationError), and when it
matches return fuego.HTTPError{Err: err, Status: http.StatusBadRequest};
otherwise keep returning a 500 for other errors. Ensure you import "errors" if
needed and keep the rest of the flow (context timeout and success response)
unchanged.

Comment on lines +136 to +144
func (s *ClientService) Delete(ctx context.Context, name string) error {
	// Ensure stopped
	_ = s.Stop(ctx, name, true)
	op, err := s.client.DeleteInstance(name)
	if err != nil {
		return err
	}
	return waitOp(ctx, op, s.timeout)
}

⚠️ Potential issue | 🟠 Major

Respect ctx and don’t ignore Stop errors in Delete.

Ignoring Stop errors can mask ctx cancellation and proceed to delete while the op continues server-side. Validate input, honor ctx, and only proceed past Stop on non-ctx errors.

 func (s *ClientService) Delete(ctx context.Context, name string) error {
-	// Ensure stopped
-	_ = s.Stop(ctx, name, true)
+	if name == "" {
+		return errors.New("name is required")
+	}
+	if err := ctx.Err(); err != nil {
+		return err
+	}
+	// Ensure stopped (force), but propagate ctx cancellation
+	if err := s.Stop(ctx, name, true); err != nil {
+		if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
+			return err
+		}
+		// proceed for other errors; delete may still succeed
+	}
 	op, err := s.client.DeleteInstance(name)
 	if err != nil {
 		return err
 	}
 	return waitOp(ctx, op, s.timeout)
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func (s *ClientService) Delete(ctx context.Context, name string) error {
	if name == "" {
		return errors.New("name is required")
	}
	if err := ctx.Err(); err != nil {
		return err
	}
	// Ensure stopped (force), but propagate ctx cancellation
	if err := s.Stop(ctx, name, true); err != nil {
		if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
			return err
		}
		// proceed for other errors; delete may still succeed
	}
	op, err := s.client.DeleteInstance(name)
	if err != nil {
		return err
	}
	return waitOp(ctx, op, s.timeout)
}
🤖 Prompt for AI Agents
In api/internal/features/lxd/service/service.go around lines 136–144, validate
the input name and stop ignoring errors from Stop: return early if ctx is
cancelled (check ctx.Err()) and propagate Stop errors instead of discarding
them; call s.Stop(ctx, name, true) and if it returns an error, if errors.Is(err,
context.Canceled) or errors.Is(err, context.DeadlineExceeded) return that ctx
error, otherwise return the Stop error (wrap with context if desired) and only
call s.client.DeleteInstance(name) when name is non-empty, ctx is still active,
and Stop succeeded; then proceed to waitOp as before.

Comment on lines +170 to +184
// waitOp waits on the LXD operation with a timeout respecting ctx
func waitOp(ctx context.Context, op lxdclient.Operation, timeout time.Duration) error {
	done := make(chan error, 1)
	go func() {
		done <- op.Wait()
	}()
	select {
	case err := <-done:
		return err
	case <-time.After(timeout):
		return context.DeadlineExceeded
	case <-ctx.Done():
		return ctx.Err()
	}
}

⚠️ Potential issue | 🔴 Critical

Fix waitOp: cancel server op and avoid goroutine leak on timeout/cancel.

Current code returns on timeout/ctx without canceling the LXD operation or stopping the goroutine blocked on op.Wait(), risking goroutine leaks and leaving server ops running. Cancel on timeout/ctx and best‑effort drain.

-func waitOp(ctx context.Context, op lxdclient.Operation, timeout time.Duration) error {
-	done := make(chan error, 1)
-	go func() {
-		done <- op.Wait()
-	}()
-	select {
-	case err := <-done:
-		return err
-	case <-time.After(timeout):
-		return context.DeadlineExceeded
-	case <-ctx.Done():
-		return ctx.Err()
-	}
-}
+func waitOp(ctx context.Context, op lxdclient.Operation, timeout time.Duration) error {
+	done := make(chan error, 1)
+	go func() { done <- op.Wait() }()
+
+	timer := time.NewTimer(timeout)
+	defer timer.Stop()
+
+	select {
+	case err := <-done:
+		return err
+	case <-ctx.Done():
+		_ = op.Cancel() // best-effort cancel
+		// Drain or give up quickly to avoid goroutine leak
+		select {
+		case <-done:
+		case <-time.After(100 * time.Millisecond):
+		}
+		return ctx.Err()
+	case <-timer.C:
+		_ = op.Cancel() // best-effort cancel
+		select {
+		case <-done:
+		case <-time.After(100 * time.Millisecond):
+		}
+		return context.DeadlineExceeded
+	}
+}

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In api/internal/features/lxd/service/service.go around lines 170 to 184, the
waitOp function must cancel the server operation and avoid leaking the goroutine
blocked on op.Wait() when timeout or ctx cancellation occurs; modify the select
branches for timeout and ctx.Done() to call the operation cancel method (e.g.
op.Cancel() — use the actual cancel API on lxdclient.Operation), then start a
best‑effort goroutine to drain the done channel so the original Wait goroutine
can exit, and return the appropriate error; also handle any error returned by
the cancel call where useful but do not block waiting for it.

Comment on lines +458 to +474
// LXD lifecycle endpoints
func (router *Router) LXDRoutes(s *fuego.Server, lxdCtrl *lxdController.Controller) {
	// List and create instances
	fuego.Get(s, "", lxdCtrl.List)
	fuego.Post(s, "", lxdCtrl.Create)

	// Instance-specific operations
	fuego.Get(s, "/{name}", lxdCtrl.Get)
	fuego.Post(s, "/{name}/start", lxdCtrl.Start)
	fuego.Post(s, "/{name}/stop", lxdCtrl.Stop)
	fuego.Post(s, "/{name}/restart", lxdCtrl.Restart)
	fuego.Delete(s, "/{name}", lxdCtrl.Delete)

	// Bulk delete
	fuego.Delete(s, "/all", lxdCtrl.DeleteAll)
}


⚠️ Potential issue | 🟠 Major

Route specificity: register static “/all” before “/{name}”; recheck RBAC for mass delete.

Depending on router matching, “/all” can be shadowed by “/{name}”. Also, “DELETE /all” shares the same permission as deleting one instance; consider a stricter RBAC resource or separate group.

 func (router *Router) LXDRoutes(s *fuego.Server, lxdCtrl *lxdController.Controller) {
 	// List and create instances
 	fuego.Get(s, "", lxdCtrl.List)
 	fuego.Post(s, "", lxdCtrl.Create)
 
-	// Instance-specific operations
-	fuego.Get(s, "/{name}", lxdCtrl.Get)
-	fuego.Post(s, "/{name}/start", lxdCtrl.Start)
-	fuego.Post(s, "/{name}/stop", lxdCtrl.Stop)
-	fuego.Post(s, "/{name}/restart", lxdCtrl.Restart)
-	fuego.Delete(s, "/{name}", lxdCtrl.Delete)
-
 	// Bulk delete
 	fuego.Delete(s, "/all", lxdCtrl.DeleteAll)
+
+	// Instance-specific operations
+	fuego.Get(s, "/{name}", lxdCtrl.Get)
+	fuego.Post(s, "/{name}/start", lxdCtrl.Start)
+	fuego.Post(s, "/{name}/stop", lxdCtrl.Stop)
+	fuego.Post(s, "/{name}/restart", lxdCtrl.Restart)
+	fuego.Delete(s, "/{name}", lxdCtrl.Delete)
 }

Optional: mount “/all” under a stricter group (e.g., “/admin/all”) with a different RBAC resource if you want distinct permissions.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

// LXD lifecycle endpoints
func (router *Router) LXDRoutes(s *fuego.Server, lxdCtrl *lxdController.Controller) {
	// List and create instances
	fuego.Get(s, "", lxdCtrl.List)
	fuego.Post(s, "", lxdCtrl.Create)

	// Bulk delete
	fuego.Delete(s, "/all", lxdCtrl.DeleteAll)

	// Instance-specific operations
	fuego.Get(s, "/{name}", lxdCtrl.Get)
	fuego.Post(s, "/{name}/start", lxdCtrl.Start)
	fuego.Post(s, "/{name}/stop", lxdCtrl.Stop)
	fuego.Post(s, "/{name}/restart", lxdCtrl.Restart)
	fuego.Delete(s, "/{name}", lxdCtrl.Delete)
}
🤖 Prompt for AI Agents
In api/internal/routes.go around lines 458 to 474, the static DELETE /all route
can be shadowed by the parameterized /{name} route and the mass-delete is using
the same RBAC as single-instance deletes; reorder the route registrations so the
static "/all" is registered before "/{name}" to ensure it matches, and update
RBAC by either assigning a stricter permission resource to the mass-delete
handler or moving it under a protected path/group (e.g., "/admin/all") with a
distinct RBAC resource; also review and adjust any middleware/group registration
to enforce the new RBAC for the mass-delete endpoint.

- ./logs:/app/logs
- ${NIXOPUS_HOME:-/etc/nixopus}:/etc/nixopus
- /var/run/docker.sock:/var/run/docker.sock
- /var/snap/lxd/common/lxd/unix.socket:/var/snap/lxd/common/lxd/unix.socket
Contributor

⚠️ Potential issue | 🔴 Critical

Mount the LXD socket read‑only and make it opt‑in.

Same host risk applies if the path doesn’t exist; Docker will create a directory at that file path. Use :ro and consider moving to an override profile/file.

-      - /var/snap/lxd/common/lxd/unix.socket:/var/snap/lxd/common/lxd/unix.socket
+      - /var/snap/lxd/common/lxd/unix.socket:/var/snap/lxd/common/lxd/unix.socket:ro
📝 Committable suggestion

Suggested change

      - /var/snap/lxd/common/lxd/unix.socket:/var/snap/lxd/common/lxd/unix.socket:ro
🤖 Prompt for AI Agents
In docker-compose-staging.yml line 20, the LXD socket bind mount is currently
writable and always applied; change the mount to be read-only by appending :ro
(i.e.
/var/snap/lxd/common/lxd/unix.socket:/var/snap/lxd/common/lxd/unix.socket:ro),
remove it from the main compose file and instead put it in an opt-in override
(e.g. docker-compose.override.yml or a separate profile/compose fragment) so
teams must explicitly enable it, and document/validate that the host socket path
exists before starting to avoid Docker creating a directory at that path.

Comment on lines +19 to +20
# LXD unix socket, if not present on host, else mount=no-op directory)
- /var/snap/lxd/common/lxd/unix.socket:/var/snap/lxd/common/lxd/unix.socket
Contributor

⚠️ Potential issue | 🔴 Critical

Avoid creating a real directory at the socket path; make the mount opt‑in and read‑only.

If the host has no LXD, Docker will create a directory at /var/snap/lxd/common/lxd/unix.socket, which can break future LXD installs. Also mount the socket as read‑only.

Apply this minimal change:

-      # LXD unix socket, if not present on host, else mount=no-op directory)
-      - /var/snap/lxd/common/lxd/unix.socket:/var/snap/lxd/common/lxd/unix.socket
+      # LXD unix socket (opt-in; mount read-only to reduce risk)
+      - /var/snap/lxd/common/lxd/unix.socket:/var/snap/lxd/common/lxd/unix.socket:ro

Recommended: move this bind into an override file so it’s opt-in:

docker-compose.lxd.yml

services:
  nixopus-api:
    volumes:
      - /var/snap/lxd/common/lxd/unix.socket:/var/snap/lxd/common/lxd/unix.socket:ro

Run with:

  • docker compose -f docker-compose.yml -f docker-compose.lxd.yml up
🤖 Prompt for AI Agents
In docker-compose.yml around lines 19–20, the service currently binds the host
LXD socket path which causes Docker to create a real directory when LXD is
absent and is not read‑only; remove that bind from the main docker-compose.yml
and instead add an opt‑in override file (e.g., docker-compose.lxd.yml) that adds
the bind mount under the service volumes as
/var/snap/lxd/common/lxd/unix.socket:/var/snap/lxd/common/lxd/unix.socket:ro so
the mount is explicit, read‑only and only applied when the override file is
included with docker compose -f docker-compose.yml -f docker-compose.lxd.yml up.

Comment on lines +30 to +34
# TODO: LXD - include for ccli installation
LXD_ENABLED: ${LXD_ENABLED:-false}
LXD_SOCKET_PATH: ${LXD_SOCKET_PATH:-/var/snap/lxd/common/lxd/unix.socket}
LXD_PROJECT: ${LXD_PROJECT:-default}
LXD_OPERATION_TIMEOUT_SECONDS: ${LXD_OPERATION_TIMEOUT_SECONDS:-60}
Contributor

⚠️ Potential issue | 🟡 Minor

Minor: fix comment typo; keep LXD envs, and ensure code has defaults.

Change “ccli” → “CLI”. Consider adding code‑level viper defaults as a safety net.

-      # TODO: LXD - include for ccli installation
+      # TODO: LXD - include for CLI installation
📝 Committable suggestion

Suggested change

# TODO: LXD - include for CLI installation
LXD_ENABLED: ${LXD_ENABLED:-false}
LXD_SOCKET_PATH: ${LXD_SOCKET_PATH:-/var/snap/lxd/common/lxd/unix.socket}
LXD_PROJECT: ${LXD_PROJECT:-default}
LXD_OPERATION_TIMEOUT_SECONDS: ${LXD_OPERATION_TIMEOUT_SECONDS:-60}
🤖 Prompt for AI Agents
In helpers/config.dev.yaml around lines 30 to 34, fix the comment typo by
changing “ccli” to “CLI”, keep the existing LXD environment entries as-is, and
add corresponding Viper defaults in the application config initialization (e.g.,
set default values for LXD_ENABLED, LXD_SOCKET_PATH, LXD_PROJECT, and
LXD_OPERATION_TIMEOUT_SECONDS) so runtime code has a safety net if env vars are
missing; ensure the Viper keys match these env names and use the same default
values shown in the YAML.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
api/go.mod (1)

3-3: Restore valid go directive and pin toolchain separately.

go.mod rejects a patch-level go directive; it must be major.minor. Reintroduce go 1.25 and add a toolchain go1.25.3 stanza so the build actually uses 1.25.3.

-go 1.25.3
+go 1.25
+toolchain go1.25.3
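For reference, the resulting module header would read like this (module path taken from the import paths quoted elsewhere in this thread):

```
module github.com/raghavyuva/nixopus-api

go 1.25

toolchain go1.25.3
```

The go directive states the minimum language version; the toolchain directive is what actually pins the 1.25.3 build.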
🧹 Nitpick comments (1)
cli/app/commands/install/run.py (1)

487-490: Remove defensive getattr call.

Since include_lxd is always initialized in __init__ (line 116), the getattr with a default value is unnecessary.

Apply this diff:

-        # If user requested LXD during install, enable it in API env
-        if getattr(self, "include_lxd", False):
+        # If user requested LXD during install, enable it in API env
+        if self.include_lxd:
             if "LXD_ENABLED" in updated_env:
                 updated_env["LXD_ENABLED"] = "true"
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 15f6796 and fb8880a.

⛔ Files ignored due to path filters (1)
  • api/go.sum is excluded by !**/*.sum
📒 Files selected for processing (4)
  • api/go.mod (8 hunks)
  • cli/app/commands/install/command.py (2 hunks)
  • cli/app/commands/install/run.py (5 hunks)
  • cli/pyproject.toml (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • cli/pyproject.toml
🧰 Additional context used
🧬 Code graph analysis (1)
cli/app/commands/install/run.py (2)
cli/app/utils/lib.py (2)
  • HostInformation (61-91)
  • get_os_name (63-64)
cli/tests/commands/preflight/test_deps.py (2)
  • warning (40-41)
  • info (37-38)
🔇 Additional comments (3)
cli/app/commands/install/run.py (2)

104-104: LGTM!

The parameter addition and initialization are correctly implemented with a safe default value.

Also applies to: 116-116


186-186: LGTM!

The installation step is appropriately positioned in the flow and clearly marked as optional.

cli/app/commands/install/command.py (1)

42-46: LGTM!

The CLI option is well-documented and correctly threaded through to the Install constructor. The help text clearly communicates that this is a Linux-only feature using a local unix socket.

Also applies to: 62-62

Comment on lines +520 to +559
def _install_lxd_if_enabled(self):
    if not self.include_lxd:
        return
    os_name = HostInformation.get_os_name()
    if os_name.lower() != "linux":
        self.logger.warning("--include-lxd is supported only on Linux hosts; skipping")
        return
    # best-effort installation via snap
    cmds = [
        "sudo snap install lxd",
        "sudo usermod -aG lxd $USER",
    ]
    preseed = {
        "config": {},
        "networks": [{"name": "lxdbr0", "type": "bridge", "config": {"ipv4.address": "auto", "ipv6.address": "none"}}],
        "storage_pools": [{"name": "default", "driver": "zfs", "config": {"size": "10GB"}}],
        "profiles": [
            {
                "name": "default",
                "config": {},
                "devices": {"root": {"path": "/", "pool": "default", "type": "disk"}},
            }
        ],
        "cluster": {"server_name": "nixopus-host"},
    }
    if self.dry_run:
        for c in cmds:
            self.logger.info(f"[dry-run] Would run: {c}")
        self.logger.info("[dry-run] Would run: lxd init --preseed <config>")
        return
    for c in cmds:
        try:
            subprocess.check_call(c, shell=True)
        except Exception as e:
            self.logger.warning(f"Failed command '{c}': {e}")
    try:
        p = subprocess.Popen(["lxd", "init", "--preseed"], stdin=subprocess.PIPE)
        p.communicate(input=yaml.safe_dump(preseed).encode())
    except Exception as e:
        self.logger.warning(f"Failed to preseed LXD: {e}")
Contributor

⚠️ Potential issue | 🟠 Major

Improve subprocess security and error handling.

The implementation has several concerns:

  1. Using shell=True (line 552) is a security anti-pattern, even with hardcoded commands. Prefer the list form.
  2. The $USER variable expansion (line 530) relies on shell environment and may not behave as expected in all contexts.
  3. No validation that snap is available before attempting installation.
  4. Catching generic Exception instead of subprocess.CalledProcessError masks the actual error type.
  5. ZFS driver (line 535) requires specific kernel modules that may not be available on all Linux systems.
  6. No validation or feedback that preseed succeeded.

Consider these improvements:

def _install_lxd_if_enabled(self):
    if not self.include_lxd:
        return
    os_name = HostInformation.get_os_name()
    if os_name.lower() != "linux":
        self.logger.warning("--include-lxd is supported only on Linux hosts; skipping")
        return
    
    # Check if snap is available (requires `import shutil` at module top)
    if not shutil.which("snap"):
        self.logger.warning("snap not found; skipping LXD installation")
        return
    
    # Use list form instead of shell=True for security
    import getpass
    current_user = getpass.getuser()
    cmds = [
        ["sudo", "snap", "install", "lxd"],
        ["sudo", "usermod", "-aG", "lxd", current_user],
    ]
    preseed = {
        "config": {},
        "networks": [{"name": "lxdbr0", "type": "bridge", "config": {"ipv4.address": "auto", "ipv6.address": "none"}}],
        "storage_pools": [{"name": "default", "driver": "dir"}],  # Use 'dir' instead of 'zfs' for broader compatibility
        "profiles": [
            {
                "name": "default",
                "config": {},
                "devices": {"root": {"path": "/", "pool": "default", "type": "disk"}},
            }
        ],
        "cluster": {"server_name": "nixopus-host"},
    }
    if self.dry_run:
        for c in cmds:
            self.logger.info(f"[dry-run] Would run: {' '.join(c)}")
        self.logger.info("[dry-run] Would run: lxd init --preseed <config>")
        return
    
    for c in cmds:
        try:
            subprocess.check_call(c)  # No shell=True
            self.logger.info(f"Successfully executed: {' '.join(c)}")
        except subprocess.CalledProcessError as e:
            self.logger.warning(f"Failed command '{' '.join(c)}': {e}")
            return  # Stop on first failure
    
    try:
        p = subprocess.Popen(["lxd", "init", "--preseed"], stdin=subprocess.PIPE)
        stdout, stderr = p.communicate(input=yaml.safe_dump(preseed).encode(), timeout=30)
        if p.returncode != 0:
            self.logger.warning(f"LXD preseed failed with code {p.returncode}")
        else:
            self.logger.info("LXD installation and initialization completed successfully")
    except subprocess.TimeoutExpired:
        p.kill()
        self.logger.warning("LXD preseed timed out")
    except Exception as e:
        self.logger.warning(f"Failed to preseed LXD: {e}")
    
    self.logger.info("You may need to log out/in for group changes to apply.")

Key improvements:

  • Validates snap availability before proceeding
  • Uses list form instead of shell=True for better security
  • Uses getpass.getuser() instead of shell $USER expansion
  • Uses dir driver instead of zfs for broader compatibility
  • Catches specific subprocess.CalledProcessError instead of generic Exception
  • Adds timeout to preseed operation
  • Provides success/failure feedback
  • Stops on first command failure instead of continuing
🤖 Prompt for AI Agents
In cli/app/commands/install/run.py around lines 520 to 559, the LXD install
routine should be hardened: verify snap exists (shutil.which) and return with a
warning if missing; replace shell=True calls with list-form args and use
getpass.getuser() instead of "$USER"; avoid continuing on failure by catching
subprocess.CalledProcessError (not broad Exception) and returning on first
failed command while logging details; prefer the "dir" storage driver for wider
compatibility or detect ZFS kernel support before choosing "zfs"; run lxd init
--preseed with subprocess.communicate including a timeout, check p.returncode
and log success/failure, and finally log that the user may need to re-login for
group changes to apply.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
api/go.mod (1)

3-3: Go directive must be major.minor; pin patch via toolchain.

go.mod only allows go 1.25 format. Keep the patch pin by adding a toolchain go1.25.3 line after the module header.

🧹 Nitpick comments (3)
cli/app/commands/install/run.py (2)

186-188: Consider guarding and time-bounding the LXD step

This step can invoke sudo and external installers; add timeouts inside the step and skip early when prerequisites are missing (e.g., no snap). This avoids hangs and improves reliability.


487-491: Gate LXD env on OS and intent

Avoid enabling LXD on non-Linux or when install is skipped. Add an OS check to prevent accidental enablement.

-        # If user requested LXD during install, enable it in API env
-        if getattr(self, "include_lxd", False):
-            if "LXD_ENABLED" in updated_env:
-                updated_env["LXD_ENABLED"] = "true"
+        # If user requested LXD and we're on Linux, enable it in API env
+        if getattr(self, "include_lxd", False) and HostInformation.get_os_name().lower() == "linux":
+            if "LXD_ENABLED" in updated_env:
+                updated_env["LXD_ENABLED"] = "true"
cli/app/commands/install/command.py (1)

42-46: CLI flag addition is clear

Flag reads well. Consider noting that a re-login may be required for group changes.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 15f6796 and fb8880a.

⛔ Files ignored due to path filters (1)
  • api/go.sum is excluded by !**/*.sum
📒 Files selected for processing (4)
  • api/go.mod (8 hunks)
  • cli/app/commands/install/command.py (2 hunks)
  • cli/app/commands/install/run.py (5 hunks)
  • cli/pyproject.toml (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
cli/app/commands/install/run.py (2)
cli/app/utils/lib.py (2)
  • HostInformation (61-91)
  • get_os_name (63-64)
cli/tests/commands/preflight/test_deps.py (2)
  • warning (40-41)
  • info (37-38)
cli/app/commands/install/command.py (1)
cli/app/commands/install/run.py (1)
  • Install (91-560)
🔇 Additional comments (2)
cli/app/commands/install/run.py (1)

104-117: Flag plumbing looks good

include_lxd is correctly added and stored. No issues.

cli/app/commands/install/command.py (1)

62-63: Wiring include_lxd through Install

Correctly propagated to the installer.

Comment on lines +520 to +560
def _install_lxd_if_enabled(self):
    if not self.include_lxd:
        return
    os_name = HostInformation.get_os_name()
    if os_name.lower() != "linux":
        self.logger.warning("--include-lxd is supported only on Linux hosts; skipping")
        return
    # best-effort installation via snap
    cmds = [
        "sudo snap install lxd",
        "sudo usermod -aG lxd $USER",
    ]
    preseed = {
        "config": {},
        "networks": [{"name": "lxdbr0", "type": "bridge", "config": {"ipv4.address": "auto", "ipv6.address": "none"}}],
        "storage_pools": [{"name": "default", "driver": "zfs", "config": {"size": "10GB"}}],
        "profiles": [
            {
                "name": "default",
                "config": {},
                "devices": {"root": {"path": "/", "pool": "default", "type": "disk"}},
            }
        ],
        "cluster": {"server_name": "nixopus-host"},
    }
    if self.dry_run:
        for c in cmds:
            self.logger.info(f"[dry-run] Would run: {c}")
        self.logger.info("[dry-run] Would run: lxd init --preseed <config>")
        return
    for c in cmds:
        try:
            subprocess.check_call(c, shell=True)
        except Exception as e:
            self.logger.warning(f"Failed command '{c}': {e}")
    try:
        p = subprocess.Popen(["lxd", "init", "--preseed"], stdin=subprocess.PIPE)
        p.communicate(input=yaml.safe_dump(preseed).encode())
    except Exception as e:
        self.logger.warning(f"Failed to preseed LXD: {e}")
    self.logger.info("LXD installation attempted. You may need to log out/in for group changes to apply.")
Contributor

⚠️ Potential issue | 🟠 Major

LXD install flow: user targeting, timeouts, sudo, and storage backend robustness

Current flow has several reliability pitfalls:

  • Uses $USER under sudo; may modify root instead of the invoking user.
  • No timeouts; sudo prompts can hang the installer.
  • Runs lxd init without sudo; group changes won’t apply until re-login, making init likely to fail.
  • Assumes ZFS; commonly unavailable; preseed should default to a portable driver (dir) or detect availability.
  • Uses shell=True and ignores return codes.

Suggest a safer, portable, and time-bounded implementation.

-    def _install_lxd_if_enabled(self):
-        if not self.include_lxd:
-            return
-        os_name = HostInformation.get_os_name()
-        if os_name.lower() != "linux":
-            self.logger.warning("--include-lxd is supported only on Linux hosts; skipping")
-            return
-        # best-effort installation via snap
-        cmds = [
-            "sudo snap install lxd",
-            "sudo usermod -aG lxd $USER",
-        ]
-        preseed = {
-            "config": {},
-            "networks": [{"name": "lxdbr0", "type": "bridge", "config": {"ipv4.address": "auto", "ipv6.address": "none"}}],
-            "storage_pools": [{"name": "default", "driver": "zfs", "config": {"size": "10GB"}}],
-            "profiles": [
-                {
-                    "name": "default",
-                    "config": {},
-                    "devices": {"root": {"path": "/", "pool": "default", "type": "disk"}},
-                }
-            ],
-            "cluster": {"server_name": "nixopus-host"},
-        }
-        if self.dry_run:
-            for c in cmds:
-                self.logger.info(f"[dry-run] Would run: {c}")
-            self.logger.info("[dry-run] Would run: lxd init --preseed <config>")
-            return
-        for c in cmds:
-            try:
-                subprocess.check_call(c, shell=True)
-            except Exception as e:
-                self.logger.warning(f"Failed command '{c}': {e}")
-        try:
-            p = subprocess.Popen(["lxd", "init", "--preseed"], stdin=subprocess.PIPE)
-            p.communicate(input=yaml.safe_dump(preseed).encode())
-        except Exception as e:
-            self.logger.warning(f"Failed to preseed LXD: {e}")
-        self.logger.info("LXD installation attempted. You may need to log out/in for group changes to apply.")
+    def _install_lxd_if_enabled(self):
+        if not self.include_lxd:
+            return
+        if HostInformation.get_os_name().lower() != "linux":
+            self.logger.warning("--include-lxd is supported only on Linux hosts; skipping")
+            return
+
+        import getpass  # local import to avoid top-level dependency
+
+        def _run(cmd, timeout: int = 300, check: bool = True, **kwargs) -> bool:
+            try:
+                subprocess.run(cmd, check=check, timeout=timeout, **kwargs)
+                return True
+            except subprocess.TimeoutExpired:
+                self.logger.warning(f"Timed out: {' '.join(cmd)}")
+                return False
+            except subprocess.CalledProcessError as e:
+                self.logger.warning(f"Command failed ({e.returncode}): {' '.join(cmd)}")
+                return False
+            except Exception as e:
+                self.logger.warning(f"Command error for {' '.join(cmd)}: {e}")
+                return False
+
+        # Pre-checks
+        lxd_present = HostInformation.command_exists("lxd")
+        snap_present = HostInformation.command_exists("snap")
+
+        if self.dry_run:
+            if not lxd_present:
+                if snap_present:
+                    self.logger.info("[dry-run] Would run: sudo snap install lxd")
+                else:
+                    self.logger.info("[dry-run] snap not found; cannot auto-install LXD")
+            target_user = os.environ.get("SUDO_USER") or getpass.getuser()
+            if target_user != "root":
+                self.logger.info(f"[dry-run] Would run: sudo usermod -aG lxd {target_user}")
+            self.logger.info("[dry-run] Would run: sudo lxd init --preseed <config>")
+            return
+
+        # Install LXD via snap if missing
+        if not lxd_present:
+            if not snap_present:
+                self.logger.warning("snap not found and LXD not installed; skipping automatic LXD setup")
+                return
+            if not _run(["sudo", "snap", "install", "lxd"], timeout=self.timeout):
+                # If install fails, don't proceed to init
+                return
+
+        # Ensure invoking user is in lxd group (avoid adding root by mistake)
+        target_user = os.environ.get("SUDO_USER") or getpass.getuser()
+        if target_user != "root":
+            _run(["sudo", "usermod", "-aG", "lxd", target_user], timeout=60, check=False)
+
+        # Portable preseed: default to 'dir' storage to avoid ZFS dependency
+        preseed = {
+            "config": {},
+            "networks": [
+                {
+                    "name": "lxdbr0",
+                    "type": "bridge",
+                    "config": {"ipv4.address": "auto", "ipv6.address": "none"},
+                }
+            ],
+            "storage_pools": [{"name": "default", "driver": "dir", "config": {}}],
+            "profiles": [
+                {
+                    "name": "default",
+                    "config": {},
+                    "devices": {"root": {"path": "/", "pool": "default", "type": "disk"}},
+                }
+            ],
+            "cluster": {"server_name": "nixopus-host"},
+        }
+
+        # Initialize (use sudo to avoid group-membership race)
+        try:
+            yaml_bytes = yaml.safe_dump(preseed).encode()
+            completed = subprocess.run(
+                ["sudo", "lxd", "init", "--preseed"],
+                input=yaml_bytes,
+                check=True,
+                timeout=self.timeout,
+            )
+        except subprocess.CalledProcessError as e:
+            self.logger.warning(f"Failed to preseed LXD (exit {e.returncode})")
+        except subprocess.TimeoutExpired:
+            self.logger.warning("Timed out running: sudo lxd init --preseed")
+        except Exception as e:
+            self.logger.warning(f"Failed to preseed LXD: {e}")
+
+        self.logger.info("LXD installation attempted. You may need to log out/in for group changes to apply.")

Notes:

  • Uses SUDO_USER or the invoking user to modify the correct account.
  • Avoids shell=True; passes args as a list.
  • Adds timeouts to prevent hangs.
  • Initializes via sudo to avoid group-membership race.
  • Defaults storage driver to dir to work broadly; you can enhance by detecting zfs/btrfs availability later.

Based on relevant snippets (HostInformation.command_exists).

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In cli/app/commands/install/run.py around lines 520-560, the LXD install flow is
fragile: change usermod to target the invoking user (use SUDO_USER if present,
fallback to os.environ['USER'] or getpass.getuser()), avoid shell=True by
passing commands as argument lists, use subprocess.run with a reasonable timeout
and check return codes (log and abort/continue appropriately), run the
post-install lxd init under sudo (e.g. ["sudo","-E","lxd","init","--preseed"])
so group changes don't block it, and default the storage_pools driver to "dir"
(or detect zfs/btrfs via HostInformation.command_exists before choosing zfs)
rather than hardcoding zfs; also ensure all exceptions log full details and that
dry-run mirrors the adjusted commands and preseed behavior.

Owner

@raghavyuva raghavyuva left a comment

LGTM! minor changes

// TODO: @zhravan Remove once LXD is mandated
if config.LXD.Enabled {
	if config.LXD.SocketPath == "" {
		log.Printf("Warning: LXD is enabled but socket_path is not set; default will be used if available")
Owner

logger.Debug can be used here instead

	shared_types "github.com/raghavyuva/nixopus-api/internal/types"
)

type Controller struct {
Owner

let's follow the same init.go / types / controller / service folder structure as usual

	return &Controller{svc: svc}
}

type createRequest struct {
Owner

move to types/init.go

	})
	router.ContainerRoutes(containerGroup, containerController)

	//TODO: Keeping LXD routes optional for now
Owner

move out to separate function

	LogsPath string `mapstructure:"logs_path"`
}

type LXDConfig struct {
Owner

Keep all the LXD configs optional; do not mandate them.

# TODO: LXD - optional for now, mandatory include for cli installation
LXD_ENABLED: ${LXD_ENABLED:-false}
LXD_SOCKET_PATH: ${LXD_SOCKET_PATH:-/var/snap/lxd/common/lxd/unix.socket}
LXD_PROJECT: ${LXD_PROJECT:-default}
Collaborator Author

nixopus_lxd

Collaborator Author

@zhravan zhravan left a comment

LXD installation setup via API
