The TableScanner service provides read-only access to SQLite databases stored in KBase (via Workspace objects). It supports listing tables, inspecting schemas, querying data with filtering, sorting, and pagination, and multi-database objects.
| Environment | URL |
|---|---|
| AppDev | https://appdev.kbase.us/services/berdl_table_scanner |
| Production | https://kbase.us/services/berdl_table_scanner |
| Local Development | http://localhost:8000 |
Each user must provide their own KBase authentication token. The service does not use a shared/service-level token for production access.
- Header (recommended):
Authorization: <token> - Cookie:
kbase_session=<token>(useful for browser-based clients like DataTables Viewer)
Note for Developers: The
KB_SERVICE_AUTH_TOKENenvironment variable is available as a legacy fallback for local testing only. It should NOT be relied upon in production.
- Gzip Support: Responses >1KB are automatically compressed if the
Accept-Encoding: gzipheader is present. - Fast JSON: All responses use optimized JSON serialization (ORJSON).
Basic service check.
- Response:
{"service": "TableScanner", "version": "1.0.0", "status": "running"}
Detailed health check including connection pool stats.
Example:
curl "https://appdev.kbase.us/services/berdl_table_scanner/health"Access databases via KBase Workspace Object Reference (UPA, e.g., 76990/7/2).
List tables for a BERDLTables object.
curl -H "Authorization: $KB_TOKEN" \
"https://appdev.kbase.us/services/berdl_table_scanner/object/76990/7/2/tables"Response: Table list with schema overviews, row counts, and column counts.
Query table data.
Query Params:
limit(default: 100, max: 500000)offset(default: 0)sort_column,sort_order(ASC/DESC)search(Global text search)
curl -H "Authorization: $KB_TOKEN" \
"https://appdev.kbase.us/services/berdl_table_scanner/object/76990/7/2/tables/Genes/data?limit=100&search=kinase"Get detailed statistics for all columns in a table.
curl -H "Authorization: $KB_TOKEN" \
"https://appdev.kbase.us/services/berdl_table_scanner/object/76990/7/2/tables/Genes/stats"Response: Column statistics including null counts, distinct counts, min/max/mean, and sample values.
For objects containing multiple databases.
Note: All multi-database endpoints use query parameters for the UPA to avoid path parsing issues with slashes in workspace references.
List all databases within an object.
curl -H "Authorization: $KB_TOKEN" \
"https://appdev.kbase.us/services/berdl_table_scanner/databases?upa=76990/7/2"Response:
{
"berdl_table_id": "76990/7/2",
"databases": [
{"db_name": "GCF_000368685.1", "db_display_name": "E. coli", "tables": [...]},
{"db_name": "GCF_004211955.1", "db_display_name": "E. coli O157:H7", "tables": [...]}
],
"has_multiple_databases": true
}List tables in a specific database.
curl -H "Authorization: $KB_TOKEN" \
"https://appdev.kbase.us/services/berdl_table_scanner/db/GCF_000368685.1/tables?upa=76990/Test2"Query data from a specific database.
curl -H "Authorization: $KB_TOKEN" \
"https://appdev.kbase.us/services/berdl_table_scanner/db/GCF_000368685.1/tables/Genes/data?upa=76990/Test2&limit=100"Complex query endpoint supporting advanced filtering and aggregations.
Request Body:
{
"berdl_table_id": "76990/7/2",
"table_name": "Genes",
"limit": 100,
"offset": 0,
"filters": [
{"column": "contigs", "operator": "gt", "value": 50},
{"column": "gene_name", "operator": "like", "value": "kinase"}
],
"sort_column": "gene_length",
"sort_order": "DESC"
}Supported Filter Operators:
| Operator | Description | Example |
|---|---|---|
eq |
Equal | {"column": "status", "operator": "eq", "value": "active"} |
ne |
Not equal | {"column": "status", "operator": "ne", "value": "deleted"} |
gt |
Greater than | {"column": "length", "operator": "gt", "value": 1000} |
gte |
Greater than or equal | {"column": "score", "operator": "gte", "value": 0.95} |
lt |
Less than | {"column": "length", "operator": "lt", "value": 100} |
lte |
Less than or equal | {"column": "score", "operator": "lte", "value": 0.5} |
like |
Contains (case-sensitive) | {"column": "name", "operator": "like", "value": "kinase"} |
ilike |
Contains (case-insensitive) | {"column": "name", "operator": "ilike", "value": "KINASE"} |
in |
Value in list | {"column": "type", "operator": "in", "value": ["CDS", "gene"]} |
not_in |
Value not in list | {"column": "type", "operator": "not_in", "value": ["pseudo"]} |
between |
Range (inclusive) | {"column": "length", "operator": "between", "value": [100, 500]} |
is_null |
Is NULL | {"column": "description", "operator": "is_null"} |
is_not_null |
Is not NULL | {"column": "product", "operator": "is_not_null"} |
curl -X POST -H "Authorization: $KB_TOKEN" \
-H "Content-Type: application/json" \
-d '{"berdl_table_id": "76990/7/2", "table_name": "Genes", "limit": 100}' \
"https://appdev.kbase.us/services/berdl_table_scanner/table-data"Upload a temporary SQLite database file to the server. Useful for testing or serving local files.
Request: Multipart form data with key file
Response:
{
"handle": "local:sha256hash",
"filename": "my_db.db",
"size_bytes": 10240,
"message": "Database uploaded successfully"
}Features:
- Deduplication: Same file uploaded twice returns the same handle (SHA-256 based)
- Max Size: 500MB (configurable)
- Expiration: Uploaded files expire after 1 hour
- Quota: 10GB total upload storage (auto-cleanup when exceeded)
-
Upload File:
curl -X POST "https://appdev.kbase.us/services/berdl_table_scanner/upload" \ -F "file=@/path/to/test.db"
-
Use Handle: The returned
handle(e.g.,local:abc123def456...) can be used as theberdl_table_idorws_refin any other endpoint.- List tables:
GET /object/local:abc123.../tables - Query data:
POST /table-datawith"berdl_table_id": "local:abc123..."
- List tables:
All data query endpoints return a consistent response structure:
{
"headers": ["gene_id", "gene_name", "product"],
"data": [["ACIAD_RS00005", "dnaA", "replication initiator"], ...],
"row_count": 100,
"total_count": 3356,
"filtered_count": 3356,
"table_name": "Genes",
"response_time_ms": 125.5,
"db_query_ms": 42.0,
"cached": false,
"limit": 100,
"offset": 0
}Retrieve recent in-memory logs from the TableScanner service. This is primarily used by frontend tools (like the DataTables Viewer) for debugging and monitoring.
Query Parameters:
limit(optional, default: 100, max: 1000) – number of log entries to returnlevel(optional) – minimum log level (debug,info,warn,error,critical)
Example:
curl "https://appdev.kbase.us/services/berdl_table_scanner/system/logs?limit=50&level=error"Response (simplified):
[
{
"timestamp": "2026-02-13T10:26:17.512994",
"level": "error",
"message": "Error accessing object database 99999/999/999: ...",
"source": "backend",
"logger": "app.utils.workspace"
}
]| Status | Description |
|---|---|
| 400 | Bad Request - Invalid parameters |
| 401 | Unauthorized - Missing or invalid token |
| 404 | Not Found - Object or table not found |
| 413 | Payload Too Large - File exceeds 500MB |
| 422 | Unprocessable Entity - Invalid filter syntax |
| 500 | Internal Server Error |
| 507 | Insufficient Storage - Upload quota exceeded |