Feature/auth and deployment polish by VibhavSetlur · Pull Request #13 · kbase/tablescanner

VibhavSetlur · 2026-02-13T17:15:14Z

This pull request introduces a new in-memory logging system for the backend, allowing recent logs to be accessed via a new API endpoint. It also adds comprehensive documentation and scripts for deployment and integration testing, improving observability and test coverage. The most significant changes are grouped below:

In-memory logging and log API:

Added a custom MemoryLogHandler in app/services/logger.py to store up to 1000 recent log entries in memory, with a setup function to attach it to the root logger.
Integrated logging setup into the FastAPI app lifespan in app/main.py, ensuring logs are captured from startup.
Added a new /system/logs endpoint in app/routes.py to retrieve recent logs, with optional filtering by log level and limit. Includes a LogEntry response model.
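
The handler described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code from app/services/logger.py: the entry fields, the get_logs method name, and the setup_logging signature are assumptions made for the example.

```python
import logging
from collections import deque
from datetime import datetime, timezone


class MemoryLogHandler(logging.Handler):
    """Keep the most recent log records in a bounded in-memory buffer."""

    def __init__(self, capacity: int = 1000) -> None:
        super().__init__()
        # deque(maxlen=...) drops the oldest entry once capacity is reached.
        self._entries: deque[dict] = deque(maxlen=capacity)

    def emit(self, record: logging.LogRecord) -> None:
        try:
            self._entries.append(
                {
                    # Field names are hypothetical; the real LogEntry model may differ.
                    "timestamp": datetime.fromtimestamp(
                        record.created, tz=timezone.utc
                    ).isoformat(),
                    "level": record.levelname,
                    "logger": record.name,
                    "message": record.getMessage(),
                }
            )
        except Exception:
            self.handleError(record)

    def get_logs(self, limit: int = 100) -> list[dict]:
        """Return up to `limit` of the newest entries, oldest first."""
        if limit <= 0:
            return []
        # Handler.handle() holds this same lock while emit() runs.
        with self.lock:
            snapshot = list(self._entries)
        return snapshot[-limit:]


def setup_logging(capacity: int = 1000) -> MemoryLogHandler:
    """Attach a memory handler to the root logger and return it."""
    handler = MemoryLogHandler(capacity)
    logging.getLogger().addHandler(handler)
    return handler
```

A route such as /system/logs could then return handler.get_logs(limit) as its response body. The bounded deque keeps memory use constant regardless of how long the process runs.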

Testing and deployment documentation:

Added scripts/test_deployment.sh and its documentation in scripts/README.md, providing a comprehensive deployment test suite covering all major features, authentication methods, and object-specific tests.
Added scripts/run_e2e.sh, a full end-to-end test script simulating deployment scenarios and validating data integrity, authentication, and multi-database support.
Created tests/integration/README.md describing available integration test scripts, usage, features, and troubleshooting, and referencing the consolidated deployment test.

Other improvements:

Updated app/utils/__init__.py to clarify that KBase Workspace API interactions now use direct HTTP requests instead of KBUtilLib.
This pull request introduces a new in-memory logging system with a corresponding system logs API endpoint, improves developer documentation and deployment testing, and cleans up legacy dependencies. The main highlights are the addition of a /system/logs endpoint for real-time log access, a new MemoryLogHandler for capturing logs in memory, and comprehensive end-to-end deployment test scripts for validating backend functionality.

System Logging and API Enhancements:

Added a MemoryLogHandler in app/services/logger.py to capture and store recent log records in memory, along with utility functions for setup and retrieval.
Integrated logging setup in the application startup (setup_logging() in app/main.py) and exposed a new /system/logs endpoint in app/routes.py for retrieving recent logs, with support for filtering and limiting results.

Developer Experience and Testing:

Added a comprehensive deployment test script (scripts/run_e2e.sh) to perform end-to-end backend validation, including authentication, table access, pagination, statistics, and multi-database support.
Introduced a scripts/README.md to document available scripts and describe the deployment test workflow in detail.

Dependency and Documentation Cleanup:

Removed the external KBUtilLib dependency from the Dockerfile, reflecting a move to direct HTTP requests for KBase Workspace API interactions.

These changes improve observability, streamline deployment validation, and simplify the codebase for future maintenance.


Copilot

Pull request overview

This pull request introduces an in-memory logging system with a /system/logs API endpoint, adds comprehensive deployment testing infrastructure through bash and Python scripts, and migrates from KBUtilLib to direct HTTP requests for KBase Workspace API interactions. The changes improve observability and testing capabilities but introduce several critical security and reliability issues that must be addressed.

Changes:

Added in-memory logging system (MemoryLogHandler) with /system/logs endpoint for real-time log access
Removed KBUtilLib dependency and migrated to direct HTTP requests through a requests.Session with trust_env set to False
Added 10+ test scripts covering deployment validation, integration testing, and end-to-end scenarios
Updated documentation with testing guides and deployment instructions

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 33 comments.


File	Description
app/services/logger.py	New in-memory log handler storing up to 1000 recent log entries
app/routes.py	Added `/system/logs` endpoint to retrieve recent logs with filtering
app/main.py	Integrated logging setup into app lifespan
app/utils/workspace.py	Migrated from KBUtilLib to direct HTTP API calls with custom session handling
app/utils/__init__.py	Updated documentation to reflect direct HTTP usage
Dockerfile	Removed KBUtilLib cloning and PYTHONPATH configuration
validate_deployment.sh	New deployment validation script with health checks and API tests
tests/integration/*.sh	Multiple integration test scripts for comprehensive backend validation
scripts/test_deployment.sh	Comprehensive deployment test suite with 13 test categories
scripts/run_e2e.sh	End-to-end test script for deployment simulation
scripts/README.md	Documentation for deployment test workflow
tests/integration/README.md	Integration test usage guide and troubleshooting
tests/integration/test_deployment.py	Python-based deployment integration tests
tests/integration/test_docker_deployment.py	Docker environment simulation tests
scripts/test_production_ready.py	Production readiness validation script


Copilot · 2026-02-13T17:22:43Z

app/utils/workspace.py

    def __init__(
        self,
-        token: str,
+        token: str | None,


The type annotation 'str | None' uses the union syntax introduced in Python 3.10 (PEP 604). The Dockerfile specifies python3.13-alpine, which supports it, and the same form appears on lines 8 and 30 and on line 244 of routes.py, so usage within the codebase is consistent. Verify that all deployment targets run Python 3.10 or later; otherwise use 'Optional[str]' for broader compatibility.

Copilot · 2026-02-13T17:22:43Z

app/services/logger.py

+        except Exception:
+            self.handleError(record)


The except branch on line 29 calls handleError when emit() raises. The inherited logging.Handler.handleError prints a traceback to stderr only while logging.raiseExceptions is true (the default); when that flag is set to false, as is common in production, the error is silently dropped. This could mask logging failures. Consider logging the failure explicitly or re-raising so that logging issues remain visible during development and debugging.
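
For reference, the standard library's handleError behaviour depends on the module-level logging.raiseExceptions flag. The sketch below uses a deliberately failing handler (a hypothetical stand-in, not the PR's MemoryLogHandler) to show what reaches stderr in each mode.

```python
import contextlib
import io
import logging


class FailingHandler(logging.Handler):
    """Handler whose emit() always fails, to exercise handleError()."""

    def emit(self, record: logging.LogRecord) -> None:
        try:
            raise RuntimeError("simulated failure while storing the record")
        except Exception:
            self.handleError(record)


def stderr_output(raise_exceptions: bool) -> str:
    """Return what handleError() writes to stderr for the given flag value."""
    previous = logging.raiseExceptions
    logging.raiseExceptions = raise_exceptions
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stderr(buffer):
            record = logging.LogRecord(
                "demo", logging.INFO, "demo.py", 1, "hello", None, None
            )
            FailingHandler().handle(record)
    finally:
        # Restore the global flag so other code is unaffected.
        logging.raiseExceptions = previous
    return buffer.getvalue()
```

With the flag left at its default, stderr_output(True) contains a "--- Logging error ---" report with the traceback; stderr_output(False) returns an empty string.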

Copilot · 2026-02-13T17:22:44Z

tests/integration/verify_deployment.sh

+# Get token from .env
+TOKEN=$(grep KB_SERVICE_AUTH_TOKEN .env | cut -d'=' -f2 | tr -d '"' | tr -d "'")


The TOKEN extraction on line 45 parses .env with grep, cut, and tr, which is fragile if the file format varies. The grep matches any line that contains the key, including comments; cut -d'=' -f2 truncates a value that itself contains '='; whitespace around '=' is left in the result; and tr deletes every quote character rather than one surrounding pair. Consider using a more robust parsing method or a dedicated .env file parser.

Suggested change

-# Get token from .env
-TOKEN=$(grep KB_SERVICE_AUTH_TOKEN .env | cut -d'=' -f2 | tr -d '"' | tr -d "'")
+# Get token from .env (handle spaces and quoted values robustly)
+TOKEN=$(awk '
+/^[[:space:]]*KB_SERVICE_AUTH_TOKEN[[:space:]]*=/ {
+    line = $0
+    # Remove the key and '=' (with optional surrounding spaces)
+    sub(/^[[:space:]]*KB_SERVICE_AUTH_TOKEN[[:space:]]*=/, "", line)
+    # Trim leading/trailing whitespace
+    gsub(/^[[:space:]]*/, "", line)
+    gsub(/[[:space:]]*$/, "", line)
+    # Strip a single pair of surrounding quotes, if present
+    if (line ~ /^".*"$/ || line ~ /^'\''.*'\''$/) {
+        line = substr(line, 2, length(line) - 2)
+    }
+    print line
+    exit
+}' .env)

Copilot · 2026-02-13T17:22:45Z

app/utils/workspace.py

+            # Use a session to ensure clean connection handling and bypass proxies
+            with requests.Session() as session:
+                session.trust_env = False  # Critical for avoiding stalls


Setting trust_env=False disables reading proxy settings from environment variables (HTTP_PROXY, HTTPS_PROXY, etc.). While the comment indicates this is intentional to avoid stalls, this could break deployments in environments that require proxy configuration. Consider making this configurable via a setting rather than hardcoding, or document this behavior clearly for deployment scenarios.

Suggested change

-            # Use a session to ensure clean connection handling and bypass proxies
-            with requests.Session() as session:
-                session.trust_env = False  # Critical for avoiding stalls
+            # Use a session to ensure clean connection handling and (by default) bypass proxies
+            with requests.Session() as session:
+                # Allow overriding trust_env via environment for proxy-dependent deployments.
+                # Default remains False to avoid stalls in environments where proxies misbehave.
+                trust_env_override = os.getenv("WORKSPACE_TRUST_ENV")
+                if trust_env_override is not None:
+                    session.trust_env = trust_env_override.lower() in ("1", "true", "yes", "on")
+                else:
+                    session.trust_env = False
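
The override logic in this suggestion can also be factored into a small pure function, which makes it testable without creating a session. This is a sketch only: resolve_trust_env is a hypothetical helper name, and WORKSPACE_TRUST_ENV is the variable name proposed in the suggestion, not an existing setting.

```python
from collections.abc import Mapping

# Values treated as "enabled", matching the suggested change.
TRUTHY = {"1", "true", "yes", "on"}


def resolve_trust_env(
    env: Mapping[str, str], name: str = "WORKSPACE_TRUST_ENV"
) -> bool:
    """Decide whether the session should honour proxy-related environment settings."""
    raw = env.get(name)
    if raw is None:
        # Unset: keep the PR's default of ignoring the environment.
        return False
    return raw.strip().lower() in TRUTHY
```

The call site would then read `session.trust_env = resolve_trust_env(os.environ)`. Note that any unrecognised value, such as a typo, resolves to False rather than raising.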

Copilot · 2026-02-13T17:22:45Z

app/utils/workspace.py

+    def _workspace_auth_header(self) -> str:
+        """Return Authorization header value.

-        Args:
-            handle_ref: Handle ID (KBH_xxxxx format)
-            target_path: Where to save the file
-
-        Returns:
-            Path to downloaded file
+        KBase workspace API expects just the token string, NOT "Bearer <token>".
+        This method ensures we send the raw token without any prefix.
        """
-        # Ensure directory exists
-        target_path = Path(target_path)
-        target_path.parent.mkdir(parents=True, exist_ok=True)
-
-        if self._use_kbutillib and self._client:
-            try:
-                # Ensure KBUtilLib has the token set
-                if hasattr(self._client, 'save_token'):
-                    self._client.save_token(self.token, namespace="kbase")
-                result = self._client.download_blob_file(handle_ref, str(target_path))
-                if result:
-                    return Path(result)
-            except Exception as e:
-                logger.warning(f"KBUtilLib download_blob_file failed: {e}. Using fallback.", exc_info=True)
-
-        return Path(self._download_blob_fallback(handle_ref, str(target_path)))
-
-    # =========================================================================
-    # FALLBACK METHODS (Direct API calls)
-    # =========================================================================
-
-    def _workspace_auth_header(self) -> str:
-        """Return Authorization header value for KBase Workspace API. Workspace expects Bearer token."""
-        t = self.token or ""
-        if t.startswith("Bearer ") or t.startswith("OAuth "):
-            return t
-        return f"Bearer {t}" if t else ""
+        if not self.token:
+            return ""
+        # Ensure we strip any Bearer prefix that might have been passed incorrectly
+        token = self.token.strip()
+        if token.startswith("Bearer "):
+            token = token[7:].strip()
+        return token


The _workspace_auth_header method strips "Bearer " prefix from tokens (lines 56-58), but this may break authentication if the KBase Workspace API actually expects "Bearer " in some contexts. The comment on line 50 states the API expects just the token, but this should be verified. If both formats are valid in different contexts, the method should preserve the original format or clearly document which format is expected where.

Copilot · 2026-02-13T17:22:51Z

app/utils/workspace.py

-from pathlib import Path
-from typing import Any
+import time
+import shutil


Import of 'shutil' is not used.

Suggested change

-import shutil

Copilot · 2026-02-13T17:22:51Z

app/utils/workspace.py

 import requests
-from app.utils.cache import get_upa_cache_path
+from pathlib import Path
+from typing import Any, Optional


Import of 'Optional' is not used.

Suggested change

-from typing import Any, Optional
+from typing import Any

Copilot · 2026-02-13T17:22:51Z

tests/integration/test_docker_deployment.py

+
+    def test_file_permissions(self):
+        """Test that cache directory has correct permissions."""
+        app = create_app()


Variable app is not used.

Suggested change

-        app = create_app()
+        create_app()

Copilot · 2026-02-13T17:22:52Z

scripts/test_production_ready.py

+        print('=' * 70)
+
+        # Test 2: Workspace access
+        pangenome_data = test_workspace_access(upa)


Variable pangenome_data is not used.

Suggested change

-        pangenome_data = test_workspace_access(upa)
+        test_workspace_access(upa)

Copilot · 2026-02-13T17:22:52Z

app/services/logger.py

+            level = level.lower()
+            # Filter by level severity logic could be added here, 
+            # for now just exact match or simple inclusion
+            # But usually we want min level.
+            # Let's just return all for clients to filter or implement simple filtering
+            pass


Unnecessary 'pass' statement.

Suggested change

-            level = level.lower()
-            # Filter by level severity logic could be added here, 
-            # for now just exact match or simple inclusion
-            # But usually we want min level.
-            # Let's just return all for clients to filter or implement simple filtering
-            pass
+            # Normalize level; actual filtering can be implemented here later.
+            level = level.lower()

VibhavSetlur added 4 commits February 13, 2026 10:31

feat: enhance auth handling, logging, and add deployment verification…

00506cd

… tests

chore: remove unused KBUtilLib dependency

9e9b893

feat: finalize KBUtilLib removal and update integration tests

4ad61c6

feat: add memory logging service and system logs endpoint

021855b

VibhavSetlur requested a review from Copilot February 13, 2026 17:15

Copilot started reviewing on behalf of VibhavSetlur February 13, 2026 17:15

Merge branch 'develop' into feature/auth-and-deployment-polish

a64efd7

Copilot AI reviewed Feb 13, 2026


feat: improve token parsing and refine logging

c8c06f1

VibhavSetlur merged commit ca07aca into develop Feb 13, 2026
6 checks passed

VibhavSetlur merged 6 commits into develop from feature/auth-and-deployment-polish


