Skip to content

Feature/auth and deployment polish#13

Merged
VibhavSetlur merged 6 commits intodevelopfrom
feature/auth-and-deployment-polish
Feb 13, 2026
Merged

Feature/auth and deployment polish#13
VibhavSetlur merged 6 commits intodevelopfrom
feature/auth-and-deployment-polish

Conversation

@VibhavSetlur
Copy link
Collaborator

This pull request introduces a new in-memory logging system for the backend, allowing recent logs to be accessed via a new API endpoint. It also adds comprehensive documentation and scripts for deployment and integration testing, improving observability and test coverage. The most significant changes are grouped below:

In-memory logging and log API:

  • Added a custom MemoryLogHandler in app/services/logger.py to store up to 1000 recent log entries in memory, with a setup function to attach it to the root logger. (F3833d7R1)
  • Integrated logging setup into the FastAPI app lifespan in app/main.py, ensuring logs are captured from startup.
  • Added a new /system/logs endpoint in app/routes.py to retrieve recent logs, with optional filtering by log level and limit. Includes a LogEntry response model. [1] [2] [3]

Testing and deployment documentation:

  • Added scripts/test_deployment.sh and its documentation in scripts/README.md, providing a comprehensive deployment test suite covering all major features, authentication methods, and object-specific tests.
  • Added scripts/run_e2e.sh, a full end-to-end test script simulating deployment scenarios and validating data integrity, authentication, and multi-database support.
  • Created tests/integration/README.md describing available integration test scripts, usage, features, and troubleshooting, and referencing the consolidated deployment test.

Other improvements:

  • Updated app/utils/__init__.py to clarify that KBase Workspace API interactions now use direct HTTP requests instead of KBUtilLib.… tests
    This pull request introduces a new in-memory logging system with a corresponding system logs API endpoint, improves developer documentation and deployment testing, and cleans up legacy dependencies. The main highlights are the addition of a /system/logs endpoint for real-time log access, a new MemoryLogHandler for capturing logs in memory, and comprehensive end-to-end deployment test scripts for validating backend functionality.

System Logging and API Enhancements:

  • Added a MemoryLogHandler in app/services/logger.py to capture and store recent log records in memory, along with utility functions for setup and retrieval.
  • Integrated logging setup in the application startup (setup_logging() in app/main.py) and exposed a new /system/logs endpoint in app/routes.py for retrieving recent logs, with support for filtering and limiting results. [1] [2] [3]

Developer Experience and Testing:

  • Added a comprehensive deployment test script (scripts/run_e2e.sh) to perform end-to-end backend validation, including authentication, table access, pagination, statistics, and multi-database support.
  • Introduced a scripts/README.md to document available scripts and describe the deployment test workflow in detail.

Dependency and Documentation Cleanup:

  • Removed the external KBUtilLib dependency from the Dockerfile, reflecting a move to direct HTTP requests for KBase Workspace API interactions. [1] [2]

These changes improve observability, streamline deployment validation, and simplify the codebase for future maintenance.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request introduces an in-memory logging system with a /system/logs API endpoint, adds comprehensive deployment testing infrastructure through bash and Python scripts, and migrates from KBUtilLib to direct HTTP requests for KBase Workspace API interactions. The changes improve observability and testing capabilities but introduce several critical security and reliability issues that must be addressed.

Changes:

  • Added in-memory logging system (MemoryLogHandler) with /system/logs endpoint for real-time log access
  • Removed KBUtilLib dependency and migrated to direct HTTP requests with requests.Session(trust_env=False)
  • Added 10+ test scripts covering deployment validation, integration testing, and end-to-end scenarios
  • Updated documentation with testing guides and deployment instructions

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 33 comments.

Show a summary per file
File Description
app/services/logger.py New in-memory log handler storing up to 1000 recent log entries
app/routes.py Added /system/logs endpoint to retrieve recent logs with filtering
app/main.py Integrated logging setup into app lifespan
app/utils/workspace.py Migrated from KBUtilLib to direct HTTP API calls with custom session handling
app/utils/init.py Updated documentation to reflect direct HTTP usage
Dockerfile Removed KBUtilLib cloning and PYTHONPATH configuration
validate_deployment.sh New deployment validation script with health checks and API tests
tests/integration/*.sh Multiple integration test scripts for comprehensive backend validation
scripts/test_deployment.sh Comprehensive deployment test suite with 13 test categories
scripts/run_e2e.sh End-to-end test script for deployment simulation
scripts/README.md Documentation for deployment test workflow
tests/integration/README.md Integration test usage guide and troubleshooting
tests/integration/test_deployment.py Python-based deployment integration tests
tests/integration/test_docker_deployment.py Docker environment simulation tests
scripts/test_production_ready.py Production readiness validation script

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

def __init__(
self,
token: str,
token: str | None,
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type annotation 'str | None' uses Python 3.10+ union syntax, but the Dockerfile specifies python3.13-alpine which supports this. However, the inconsistency with line 8's 'str | None' vs line 30's 'str | None' and line 244's 'str | None' in routes.py suggests the codebase may have mixed usage. Verify that all deployment targets support Python 3.10+ or use 'Optional[str]' for broader compatibility.

Copilot uses AI. Check for mistakes.
Comment on lines +28 to +29
except Exception:
self.handleError(record)
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The handleError method call on line 29 will be called when emit() encounters an exception, but the parent class's handleError will silently ignore it by default. This could mask logging failures. Consider adding explicit error logging or raising to ensure logging issues are visible during development and debugging.

Copilot uses AI. Check for mistakes.
Comment on lines +44 to +45
# Get token from .env
TOKEN=$(grep KB_SERVICE_AUTH_TOKEN .env | cut -d'=' -f2 | tr -d '"' | tr -d "'")
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TOKEN variable extraction uses a fragile parsing approach that could fail if the .env file format varies. Line 45 uses cut/tr to parse the token, but doesn't handle cases where the value is quoted with spaces inside quotes (e.g., 'KB_SERVICE_AUTH_TOKEN="token value"'). Consider using a more robust parsing method or a dedicated .env file parser.

Suggested change
# Get token from .env
TOKEN=$(grep KB_SERVICE_AUTH_TOKEN .env | cut -d'=' -f2 | tr -d '"' | tr -d "'")
# Get token from .env (handle spaces and quoted values robustly)
TOKEN=$(awk '
/^[[:space:]]*KB_SERVICE_AUTH_TOKEN[[:space:]]*=/ {
line = $0
# Remove the key and '=' (with optional surrounding spaces)
sub(/^[[:space:]]*KB_SERVICE_AUTH_TOKEN[[:space:]]*=/, "", line)
# Trim leading/trailing whitespace
gsub(/^[[:space:]]*/, "", line)
gsub(/[[:space:]]*$/, "", line)
# Strip a single pair of surrounding quotes, if present
if (line ~ /^".*"$/ || line ~ /^'\''.*'\''$/) {
line = substr(line, 2, length(line) - 2)
}
print line
exit
}' .env)

Copilot uses AI. Check for mistakes.
Comment on lines +113 to +115
# Use a session to ensure clean connection handling and bypass proxies
with requests.Session() as session:
session.trust_env = False # Critical for avoiding stalls
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting trust_env=False disables reading proxy settings from environment variables (HTTP_PROXY, HTTPS_PROXY, etc.). While the comment indicates this is intentional to avoid stalls, this could break deployments in environments that require proxy configuration. Consider making this configurable via a setting rather than hardcoding, or document this behavior clearly for deployment scenarios.

Suggested change
# Use a session to ensure clean connection handling and bypass proxies
with requests.Session() as session:
session.trust_env = False # Critical for avoiding stalls
# Use a session to ensure clean connection handling and (by default) bypass proxies
with requests.Session() as session:
# Allow overriding trust_env via environment for proxy-dependent deployments.
# Default remains False to avoid stalls in environments where proxies misbehave.
trust_env_override = os.getenv("WORKSPACE_TRUST_ENV")
if trust_env_override is not None:
session.trust_env = trust_env_override.lower() in ("1", "true", "yes", "on")
else:
session.trust_env = False

Copilot uses AI. Check for mistakes.
Comment on lines +47 to +59
def _workspace_auth_header(self) -> str:
"""Return Authorization header value.

Args:
handle_ref: Handle ID (KBH_xxxxx format)
target_path: Where to save the file

Returns:
Path to downloaded file
KBase workspace API expects just the token string, NOT "Bearer <token>".
This method ensures we send the raw token without any prefix.
"""
# Ensure directory exists
target_path = Path(target_path)
target_path.parent.mkdir(parents=True, exist_ok=True)

if self._use_kbutillib and self._client:
try:
# Ensure KBUtilLib has the token set
if hasattr(self._client, 'save_token'):
self._client.save_token(self.token, namespace="kbase")
result = self._client.download_blob_file(handle_ref, str(target_path))
if result:
return Path(result)
except Exception as e:
logger.warning(f"KBUtilLib download_blob_file failed: {e}. Using fallback.", exc_info=True)

return Path(self._download_blob_fallback(handle_ref, str(target_path)))

# =========================================================================
# FALLBACK METHODS (Direct API calls)
# =========================================================================

def _workspace_auth_header(self) -> str:
"""Return Authorization header value for KBase Workspace API. Workspace expects Bearer token."""
t = self.token or ""
if t.startswith("Bearer ") or t.startswith("OAuth "):
return t
return f"Bearer {t}" if t else ""
if not self.token:
return ""
# Ensure we strip any Bearer prefix that might have been passed incorrectly
token = self.token.strip()
if token.startswith("Bearer "):
token = token[7:].strip()
return token
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _workspace_auth_header method strips "Bearer " prefix from tokens (lines 56-58), but this may break authentication if the KBase Workspace API actually expects "Bearer " in some contexts. The comment on line 50 states the API expects just the token, but this should be verified. If both formats are valid in different contexts, the method should preserve the original format or clearly document which format is expected where.

Copilot uses AI. Check for mistakes.
from pathlib import Path
from typing import Any
import time
import shutil
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import of 'shutil' is not used.

Suggested change
import shutil

Copilot uses AI. Check for mistakes.
import requests
from app.utils.cache import get_upa_cache_path
from pathlib import Path
from typing import Any, Optional
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import of 'Optional' is not used.

Suggested change
from typing import Any, Optional
from typing import Any

Copilot uses AI. Check for mistakes.

def test_file_permissions(self):
"""Test that cache directory has correct permissions."""
app = create_app()
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable app is not used.

Suggested change
app = create_app()
create_app()

Copilot uses AI. Check for mistakes.
print('=' * 70)

# Test 2: Workspace access
pangenome_data = test_workspace_access(upa)
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable pangenome_data is not used.

Suggested change
pangenome_data = test_workspace_access(upa)
test_workspace_access(upa)

Copilot uses AI. Check for mistakes.
Comment on lines +38 to +43
level = level.lower()
# Filter by level severity logic could be added here,
# for now just exact match or simple inclusion
# But usually we want min level.
# Let's just return all for clients to filter or implement simple filtering
pass
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary 'pass' statement.

Suggested change
level = level.lower()
# Filter by level severity logic could be added here,
# for now just exact match or simple inclusion
# But usually we want min level.
# Let's just return all for clients to filter or implement simple filtering
pass
# Normalize level; actual filtering can be implemented here later.
level = level.lower()

Copilot uses AI. Check for mistakes.
@VibhavSetlur VibhavSetlur merged commit ca07aca into develop Feb 13, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants