diff --git a/CLAUDE.md b/CLAUDE.md index f79a7e4..9682c3f 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -50,7 +50,7 @@ All under `/v1/`: | `GET /search/explore?q=&platform=&page=` | User-triggered deep GitHub search, paginated, ingests into index. Also reads `X-GitHub-Token`. Cold-path latency is 10–30s — clients must use a 30s timeout. | | `GET /categories/{trending\|new-releases\|most-popular}/{android\|windows\|macos\|linux}` | Pre-ranked repo lists. Sort order is `search_score DESC NULLS LAST, rank ASC` — static `rank` is only the tie-breaker once behavioral signals exist. | | `GET /topics/{privacy\|media\|productivity\|networking\|dev-tools}/{platform}` | Topic-bucketed repos. Same dynamic ordering as categories. | -| `GET /repo/{owner}/{name}` | Single repo detail. Curated DB hit on the fast path; on miss, lazy-fetches metadata from GitHub via `GitHubResourceClient` and reads optional `X-GitHub-Token`. Response includes `openIssuesCount` (mirrors GitHub's `open_issues_count`, which counts open issues + open PRs together — same value the GitHub website's Issues tab shows). | +| `GET /repo/{owner}/{name}` | Single repo detail. Curated DB hit on the fast path; on miss, lazy-fetches metadata from GitHub via `GitHubResourceClient` and reads optional `X-GitHub-Token`. Response includes `openIssuesCount` (mirrors GitHub's `open_issues_count`, which counts open issues + open PRs together — same value the GitHub website's Issues tab shows) and `licenseSpdxId` / `licenseName` (GitHub-detected license; null when the repo has no LICENSE file; a license file GitHub cannot classify comes through as `NOASSERTION` / `Other`, not null). | | `POST /repo/{owner}/{name}/refresh` | User-triggered refetch of a repo's metadata + latest release. Re-fetches from GitHub via `RepoRefreshCoordinator`, upserts Postgres + pushes Meili, returns the same shape as the GET. Per-repo cooldown 30s + global hourly budget 1000 prevent pool-token torch from spam clicks. Reads `X-GitHub-Token`. 
Response is `Cache-Control: no-store`; the GET path's CDN cache catches up via its own TTL (~5 min on `s-maxage=300`). | | `GET /releases/{owner}/{name}?page=&per_page=` | Proxied list of GitHub releases. Reads optional `X-GitHub-Token`. Cached server-side for 1h. | | `GET /readme/{owner}/{name}` | Proxied README JSON (base64-encoded content + metadata, GitHub's shape). Reads optional `X-GitHub-Token`. Cached 24h. | @@ -62,6 +62,7 @@ All under `/v1/`: | `POST /auth/device/start` | Stateless proxy for `github.com/login/device/code`. Client used to call GitHub directly; some user networks (documented in OpenHub-Store/GitHub-Store#433, #395) can't reach GitHub reliably. Backend adds `client_id`, forwards GitHub's body verbatim. 10 req/hr/IP. | | `POST /auth/device/poll` | Stateless proxy for `github.com/login/oauth/access_token`. Reads `device_code` from form body, adds `client_id` + `grant_type`, forwards GitHub's body verbatim (including tokens on success). The backend never logs, caches, or persists the token. 200 req/hr/IP. | | `GET /internal/metrics` | Operator-only. Gated by `X-Admin-Token` matching the `ADMIN_TOKEN` env var (open if unset, for local dev). Returns per-source search counters, P-latency, worker queue depth, and top 20 misses (8-char `query_hash` prefix only) in last 7 days. | +| `POST /internal/backfill-stale?limit=N` | Operator-only. Spawns a paced background job that refreshes every curated row whose new metadata columns are still at their migration defaults (currently keyed on `license_spdx_id IS NULL`). One concurrent run; returns 409 on re-trigger. Uses `searchClient.refreshRepo` + persist; respects the quiet window so the daily fetcher's pool stays free. Run after a column-add deploy; no-ops afterwards once the filter no longer matches. | | `GET /badge/...` | M3-styled SVG badges. Per-repo: `/badge/{owner}/{name}/{kind}/{style}/{variant}` for kind ∈ {release, stars, downloads}. 
Global: `/badge/{kind}/{style}/{variant}` for kind ∈ {users, fdroid}. Static: `/badge/static/{style}/{variant}?label=&icon=`. Style 1-12 hue, variant 1-3 shade. Vectorized glyph rendering — no font dependency at SVG embed time. | Client-facing API contract and migration history live in `internal/` (gitignored, operator-only). The client repo at `OpenHub-Store/GitHub-Store` is the public source of truth for client behavior. diff --git a/docs/client/license-info.md b/docs/client/license-info.md new file mode 100644 index 0000000..6142fcb --- /dev/null +++ b/docs/client/license-info.md @@ -0,0 +1,132 @@ +# Client Integration — `licenseSpdxId` / `licenseName` + +**Audience:** client coding agent (KMP / Compose Multiplatform). +**Goal:** surface a repo's license on the details screen using the new `licenseSpdxId` + `licenseName` fields on `RepoResponse`. No new endpoint, no extra fetch — the values ride on the existing GET response. + +--- + +## 1. What changed + +`RepoResponse` now carries two new fields: + +```kotlin +val licenseSpdxId: String? = null, // e.g. "MIT", "GPL-3.0", "Apache-2.0" +val licenseName: String? = null, // e.g. "MIT License", "GNU General Public License v3.0" +``` + +Both nullable — not every repo has a license. Old clients that don't know the fields parse cleanly via `ignoreUnknownKeys = true`. Same back-compat story as every other additive `RepoResponse` field. + +--- + +## 2. What the values mean + +GitHub's `license` object on `/repos/{owner}/{name}`. Backend only persists two fields out of GitHub's full payload: + +- `licenseSpdxId` — the SPDX short tag. Stable, machine-readable, suitable for icon mapping or filter chips. Example: `"MIT"`, `"GPL-3.0"`, `"Apache-2.0"`, `"BSD-3-Clause"`, `"AGPL-3.0"`, `"MPL-2.0"`, `"Unlicense"`. +- `licenseName` — the full human name. Use for tooltips, accessibility labels, "About" sections. Example: `"MIT License"`, `"GNU General Public License v3.0"`. 
+ +### When both are null + +The backend has no usable license to show if: +- The repo has no `LICENSE` / `LICENSE.txt` / `LICENSE.md` file at the root. GitHub returns `license: null`. +- GitHub's classifier couldn't recognise the file's content (rare; usually means a custom or modified license). GitHub then reports `spdx_id: "NOASSERTION"` with name `"Other"` rather than null; treat that pair like null. +- The repo is private + you don't have access (not a concern here; backend always uses authenticated calls). + +Show "No license" or hide the chip entirely when both are null. **Do NOT** assume "unlicensed" means "free to use": most popular OSS without a `LICENSE` file is still under default copyright. Do NOT ship UI that implies otherwise. + +### When one is set but the other is null + +Should not happen: backend writes both columns from the same GitHub object atomically. If you see it, it's a row written before V15 deployed and not yet refreshed. Treat as if both are null until refreshed. + +--- + +## 3. Where the fields appear + +Every `RepoResponse`-shaped payload, identical surface to `openIssuesCount`: + +| Endpoint | Behaviour | +|----------|-----------| +| `GET /v1/repo/{owner}/{name}` | DB-hit and lazy-fetch paths both fill it. | +| `POST /v1/repo/{owner}/{name}/refresh` | Fresh from GitHub. | +| `GET /v1/categories/.../...` | DB value. | +| `GET /v1/topics/.../...` | DB value. | +| `GET /v1/search?q=...` | Meilisearch index value. | + +Existing curated rows have `null` license fields until refreshed. Backend writes them on: +1. Search-passthrough ingest +2. Refresh button +3. Hourly worker +4. Daily Python fetcher (after fetcher repo is updated) + +--- + +## 4. Display recommendations + +- **Where:** details screen, in the "facts" row alongside language, stars, forks, open issues. Or in an info panel. +- **Chip text:** show `licenseSpdxId` ("MIT", "GPL-3.0"). Short, scannable. +- **Tooltip / long-press:** show `licenseName` ("MIT License"). +- **Tap behaviour:** open `https://github.com/{owner}/{name}/blob/HEAD/LICENSE` in a browser. Almost every licensed repo has a top-level `LICENSE` file. 
If the file is actually named `LICENSE.md` or `COPYING`, that URL returns 404 rather than redirecting; fall back to opening the repo page `https://github.com/{owner}/{name}`. +- **Icon:** generic license / scale glyph. Some clients map specific licenses to specific icons (MIT = open lock, GPL = copyleft symbol). Optional polish; `licenseSpdxId` is the key. +- **Color:** don't color-code by permissive vs copyleft vs proprietary. That implies a value judgement and tends to be controversial. Neutral chip styling. +- **Null handling:** hide the chip cleanly. Don't render "Unknown license": that's misleading. + +--- + +## 5. Pseudo-code + +```kotlin +@Composable +fun LicenseChip(repo: RepoResponse) { + val spdx = repo.licenseSpdxId?.takeUnless { it == "NOASSERTION" } ?: return // hide when absent or unclassified by GitHub + Chip( + leadingIcon = { Icon(Icons.License, contentDescription = null) }, + label = { Text(spdx) }, + modifier = Modifier.semantics { + // Use the full name for accessibility narration. + contentDescription = repo.licenseName ?: "Licensed under $spdx" + }, + onClick = { openInBrowser("https://github.com/${repo.fullName}/blob/HEAD/LICENSE") }, + ) +} +``` + +--- + +## 6. Filter / search use cases (out of scope for this PR but FYI) + +`licenseSpdxId` is now indexed in Meilisearch via the `license_spdx_id` field on the search document. If you want to add "filter by license" to the search screen later, it's already there: call `/v1/search` with a Meilisearch filter expression. Not implementing that here; just noting the data is available. + +Common useful filter sets: +- "Permissive only": `MIT`, `Apache-2.0`, `BSD-2-Clause`, `BSD-3-Clause`, `Unlicense`, `0BSD`, `ISC`. +- "Copyleft only": `GPL-2.0`, `GPL-3.0`, `AGPL-3.0`, `LGPL-2.1`, `LGPL-3.0`, `MPL-2.0` (weak, file-level copyleft). +- "Permissive or copyleft (anything but proprietary)": null exclusion + filter list. + +--- + +## 7. What you do NOT need to do + +- **No separate license fetch.** Don't call `/repos/{o}/{n}/license` against GitHub or any equivalent backend route; the value is on the repo response. 
+- **No license-text rendering.** We don't ship the full LICENSE text in the response (it can be hundreds of lines + GitHub already does this beautifully on their site). Tap the chip to open GitHub. +- **No license validation client-side.** Don't try to verify the SPDX ID against a list — backend trusts whatever GitHub returns. New SPDX tags appear over time; whitelisting client-side would create silent breakage. + +--- + +## 8. Acceptance criteria + +- [ ] `RepoResponse` deserializes with `licenseSpdxId` + `licenseName` on every call site. +- [ ] Details screen renders a license chip when `licenseSpdxId != null`, hides cleanly otherwise. +- [ ] Chip tap opens the LICENSE file on GitHub in an external browser. +- [ ] Tooltip / accessibility label uses `licenseName` when available. +- [ ] No crash when the field is absent (older server response during rollout). + +--- + +## 9. Authoritative reference + +Backend definitions: +- `model/RepoResponse.kt` — `licenseSpdxId` + `licenseName` fields. +- `db/migration/V15__license_info.sql` — the columns. +- `ingest/GitHubSearchClient.kt` — `GitHubLicense` DTO + ingest writes. +- `routes/RepoRoutes.kt`, `routes/SearchRoutes.kt`, `db/RepoRepository.kt` — mappers. + +If client and server disagree, backend wins; file an issue on the backend repo. diff --git a/src/main/kotlin/zed/rainxch/githubstore/db/DatabaseFactory.kt b/src/main/kotlin/zed/rainxch/githubstore/db/DatabaseFactory.kt index a9d90af..dfec530 100644 --- a/src/main/kotlin/zed/rainxch/githubstore/db/DatabaseFactory.kt +++ b/src/main/kotlin/zed/rainxch/githubstore/db/DatabaseFactory.kt @@ -76,6 +76,7 @@ object DatabaseFactory { // only for V13 to drop it seconds later. 
"V13__drop_telemetry_events.sql", "V14__open_issues_count.sql", + "V15__license_info.sql", ) for (migration in migrations) { val rawSql = this::class.java.classLoader diff --git a/src/main/kotlin/zed/rainxch/githubstore/db/MeilisearchClient.kt b/src/main/kotlin/zed/rainxch/githubstore/db/MeilisearchClient.kt index d6f985b..1101474 100644 --- a/src/main/kotlin/zed/rainxch/githubstore/db/MeilisearchClient.kt +++ b/src/main/kotlin/zed/rainxch/githubstore/db/MeilisearchClient.kt @@ -128,6 +128,8 @@ data class MeiliRepoHit( val stars: Int = 0, val forks: Int = 0, val open_issues: Int = 0, + val license_spdx_id: String? = null, + val license_name: String? = null, val language: String? = null, val latest_release_date: String? = null, val latest_release_tag: String? = null, diff --git a/src/main/kotlin/zed/rainxch/githubstore/db/RepoRepository.kt b/src/main/kotlin/zed/rainxch/githubstore/db/RepoRepository.kt index ebc5368..393a1fd 100644 --- a/src/main/kotlin/zed/rainxch/githubstore/db/RepoRepository.kt +++ b/src/main/kotlin/zed/rainxch/githubstore/db/RepoRepository.kt @@ -73,6 +73,8 @@ class RepoRepository { stargazersCount = this[Repos.stars], forksCount = this[Repos.forks], openIssuesCount = this[Repos.openIssues], + licenseSpdxId = this[Repos.licenseSpdxId], + licenseName = this[Repos.licenseName], language = this[Repos.language], topics = this[Repos.topics], releasesUrl = "${this[Repos.htmlUrl]}/releases", diff --git a/src/main/kotlin/zed/rainxch/githubstore/db/Tables.kt b/src/main/kotlin/zed/rainxch/githubstore/db/Tables.kt index adaba47..04053ce 100644 --- a/src/main/kotlin/zed/rainxch/githubstore/db/Tables.kt +++ b/src/main/kotlin/zed/rainxch/githubstore/db/Tables.kt @@ -17,6 +17,8 @@ object Repos : Table("repos") { val stars = integer("stars").default(0) val forks = integer("forks").default(0) val openIssues = integer("open_issues").default(0) + val licenseSpdxId = text("license_spdx_id").nullable() + val licenseName = text("license_name").nullable() val language 
= text("language").nullable() val topics = array("topics", TextColumnType()) val latestReleaseDate = timestampWithTimeZone("latest_release_date").nullable() diff --git a/src/main/kotlin/zed/rainxch/githubstore/ingest/GitHubSearchClient.kt b/src/main/kotlin/zed/rainxch/githubstore/ingest/GitHubSearchClient.kt index 486049c..b390892 100644 --- a/src/main/kotlin/zed/rainxch/githubstore/ingest/GitHubSearchClient.kt +++ b/src/main/kotlin/zed/rainxch/githubstore/ingest/GitHubSearchClient.kt @@ -521,6 +521,8 @@ class GitHubSearchClient( it[stars] = repo.stargazersCount it[forks] = repo.forksCount it[openIssues] = repo.openIssuesCount + it[licenseSpdxId] = repo.license?.spdxId + it[licenseName] = repo.license?.name it[language] = repo.language it[topics] = repo.topics it[latestReleaseDate] = releaseDate @@ -557,6 +559,8 @@ class GitHubSearchClient( stars = r.repo.stargazersCount, forks = r.repo.forksCount, open_issues = r.repo.openIssuesCount, + license_spdx_id = r.repo.license?.spdxId, + license_name = r.repo.license?.name, language = r.repo.language, topics = r.repo.topics, latest_release_date = r.release.publishedAt, @@ -604,6 +608,8 @@ class GitHubSearchClient( stargazersCount = repo.stargazersCount, forksCount = repo.forksCount, openIssuesCount = repo.openIssuesCount, + licenseSpdxId = repo.license?.spdxId, + licenseName = repo.license?.name, language = repo.language, topics = repo.topics, releasesUrl = "${repo.htmlUrl}/releases", @@ -660,6 +666,9 @@ data class GitHubRepo( // Includes open PRs (GitHub treats PRs as issues). Same number GitHub // website's Issues tab shows. @SerialName("open_issues_count") val openIssuesCount: Int = 0, + // GitHub-detected license. Null on repos with no LICENSE file. A file + // GitHub's classifier can't recognise arrives as spdx_id "NOASSERTION". + val license: GitHubLicense? = null, val language: String? 
= null, val topics: List<String> = emptyList(), val archived: Boolean = false, @@ -689,3 +698,11 @@ data class GitHubAsset( val size: Long = 0, @SerialName("download_count") val downloadCount: Long = 0, ) + +// GitHub's license object on /repos/{o}/{n}. We persist `spdx_id` + `name` +// only; the upstream `key`, `url`, and `node_id` aren't surfaced. +@Serializable +data class GitHubLicense( + @SerialName("spdx_id") val spdxId: String? = null, + val name: String? = null, +) diff --git a/src/main/kotlin/zed/rainxch/githubstore/model/RepoResponse.kt b/src/main/kotlin/zed/rainxch/githubstore/model/RepoResponse.kt index 04ccfd5..af6e288 100644 --- a/src/main/kotlin/zed/rainxch/githubstore/model/RepoResponse.kt +++ b/src/main/kotlin/zed/rainxch/githubstore/model/RepoResponse.kt @@ -23,6 +23,12 @@ data class RepoResponse( // open PRs (GitHub treats PRs as a kind of issue). Same value as the // GitHub website's Issues tab badge. val openIssuesCount: Int = 0, + // GitHub-detected license. Null when the repo has no LICENSE file; an + // unclassifiable file arrives as "NOASSERTION" / "Other". spdxId is the + // short tag for chip display ("MIT", "GPL-3.0"); name is the + // human-readable version ("MIT License"). + val licenseSpdxId: String? = null, + val licenseName: String? 
= null, val language: String?, val topics: List<String>, val releasesUrl: String?, diff --git a/src/main/kotlin/zed/rainxch/githubstore/routes/InternalRoutes.kt b/src/main/kotlin/zed/rainxch/githubstore/routes/InternalRoutes.kt index 59e28b4..e23e75f 100644 --- a/src/main/kotlin/zed/rainxch/githubstore/routes/InternalRoutes.kt +++ b/src/main/kotlin/zed/rainxch/githubstore/routes/InternalRoutes.kt @@ -2,23 +2,51 @@ package zed.rainxch.githubstore.routes import io.ktor.http.* import io.ktor.server.auth.* +import io.ktor.server.request.* import io.ktor.server.response.* import io.ktor.server.routing.* +import kotlinx.coroutines.CoroutineScope import kotlinx.coroutines.Dispatchers +import kotlinx.coroutines.SupervisorJob import kotlinx.coroutines.async import kotlinx.coroutines.coroutineScope +import kotlinx.coroutines.delay +import kotlinx.coroutines.launch import kotlinx.serialization.Serializable +import org.jetbrains.exposed.sql.SqlExpressionBuilder.isNull +import org.jetbrains.exposed.sql.selectAll import org.jetbrains.exposed.sql.transactions.TransactionManager import org.jetbrains.exposed.sql.transactions.experimental.newSuspendedTransaction +import org.jetbrains.exposed.sql.transactions.transaction +import org.slf4j.LoggerFactory +import zed.rainxch.githubstore.db.Repos +import zed.rainxch.githubstore.ingest.GitHubSearchClient import zed.rainxch.githubstore.ingest.WorkerSupervisor import zed.rainxch.githubstore.metrics.SearchMetricsRegistry import java.nio.charset.StandardCharsets import java.security.MessageDigest +import java.util.concurrent.atomic.AtomicBoolean private const val BASIC_AUTH_REALM = "github-store-admin" const val ADMIN_BASIC_AUTH = "admin-basic" -fun Route.internalRoutes(metrics: SearchMetricsRegistry, workerSupervisor: WorkerSupervisor) { +private val internalLog = LoggerFactory.getLogger("InternalRoutes") + +// Single shared scope for admin-triggered background jobs (currently just +// the metadata-backfill endpoint). 
SupervisorJob so one job's failure does +// not cancel others. IO dispatcher because the work is HTTP + JDBC. +private val backfillScope = CoroutineScope(Dispatchers.IO + SupervisorJob()) + +// One backfill at a time. Concurrent re-entry would double-count budget + +// race upserts. atomic CAS lets the first call claim, returning 409 to +// concurrent triggers. +private val backfillRunning = AtomicBoolean(false) + +fun Route.internalRoutes( + metrics: SearchMetricsRegistry, + workerSupervisor: WorkerSupervisor, + searchClient: GitHubSearchClient, +) { val adminToken: String? = System.getenv("ADMIN_TOKEN")?.takeIf { it.isNotBlank() } val isProduction = System.getenv("APP_ENV") == "production" @@ -74,6 +102,60 @@ fun Route.internalRoutes(metrics: SearchMetricsRegistry, workerSupervisor: Worke } } + // One-shot metadata backfill: refresh every curated row whose new + // columns (open_issues, license_*) are still at their migration + // defaults because no upsert has touched them since V14/V15 + // landed. Run by an operator after a column-add deploy; no-ops + // afterwards since the SQL filter no longer matches. + // + // Pacing: 500ms per repo (REPO_REFRESH_PACE_MS env honoured for + // consistency with RepoRefreshWorker). Quiet-window respected to + // keep the rotation pool free for the daily fetcher. Single + // concurrent run -- subsequent triggers get 409 until the + // current job finishes. 
+ post("/backfill-stale") { + if (!authorized(call, adminToken)) { + return@post call.respond(HttpStatusCode.NotFound, mapOf("error" to "Not found")) + } + val limit = call.request.queryParameters["limit"] + ?.toIntOrNull() + ?.coerceIn(1, 10_000) + ?: 5_000 + // Select before claiming the flag so a failed query cannot leave it set. + val candidates = transaction { + Repos.selectAll() + .where { Repos.licenseSpdxId.isNull() } + .orderBy(Repos.id) + .limit(limit) + .map { it[Repos.id] to it[Repos.fullName] } + } + if (candidates.isEmpty()) { + return@post call.respond( + HttpStatusCode.OK, + mapOf("scheduled" to 0, "message" to "no stale rows"), + ) + } + if (!backfillRunning.compareAndSet(false, true)) { + call.response.header(HttpHeaders.RetryAfter, "60") + return@post call.respond( + HttpStatusCode.Conflict, + mapOf("error" to "backfill_already_running"), + ) + } + backfillScope.launch { + try { + runBackfill(searchClient, candidates) + } finally { + backfillRunning.set(false) + } + } + call.response.header(HttpHeaders.CacheControl, "no-store") + call.respond( + HttpStatusCode.Accepted, + mapOf("scheduled" to candidates.size, "started" to true), + ) + } + // Browser dashboard. Basic Auth required in prod so the browser prompts // for credentials on first visit; optional in dev for local inspection. authenticate(ADMIN_BASIC_AUTH, optional = adminToken == null) { @@ -108,6 +190,49 @@ private fun authorized(call: io.ktor.server.application.ApplicationCall, adminTo return principal != null } +// One-shot backfill loop. Re-uses GitHubSearchClient.refreshRepo + persist +// (the same path RepoRefreshWorker runs hourly), but drops the curated-row +// exclusion so we hit the catalog rows the worker leaves alone. Pacing +// mirrors REPO_REFRESH_PACE_MS so an operator who tuned the worker also +// tunes this. Quiet window respected -- the rotation pool belongs to the +// daily fetcher between 1-4 UTC. 
+private suspend fun runBackfill( + searchClient: GitHubSearchClient, + candidates: List<Pair<Long, String>>, +) { + val pacePerRepoMs: Long = (System.getenv("REPO_REFRESH_PACE_MS")?.toLongOrNull() ?: 500L) + .coerceAtLeast(0L) + var ok = 0 + var gone = 0 + var archived = 0 + var stale = 0 + var failed = 0 + for ((_, fullName) in candidates) { + // Quiet-window guard: pause the loop, don't burn the candidate. + // The daily fetcher's pool stays free; we resume after the window. + while (searchClient.isQuietWindowNow()) { + delay(60_000) + } + when (val result = searchClient.refreshRepo(fullName)) { + is GitHubSearchClient.RefreshResult.Ok -> { + searchClient.persist(result.repo) + ok++ + } + is GitHubSearchClient.RefreshResult.NoUsableRelease -> { + stale++ + } + GitHubSearchClient.RefreshResult.Gone -> gone++ + GitHubSearchClient.RefreshResult.Archived -> archived++ + GitHubSearchClient.RefreshResult.TransientFailure -> failed++ + } + delay(pacePerRepoMs) + } + internalLog.info( + "Backfill done: ok={} gone={} archived={} no-release={} transient-fail={} (of {})", + ok, gone, archived, stale, failed, candidates.size, + ) +} + private suspend fun fetchDbMetrics(): TrainingMetrics = coroutineScope { val unprocessed = async { countUnprocessedMisses() } val reposWithSignals = async { countReposWithSignals() } diff --git a/src/main/kotlin/zed/rainxch/githubstore/routes/RepoRoutes.kt b/src/main/kotlin/zed/rainxch/githubstore/routes/RepoRoutes.kt index 979bd0b..d55d66b 100644 --- a/src/main/kotlin/zed/rainxch/githubstore/routes/RepoRoutes.kt +++ b/src/main/kotlin/zed/rainxch/githubstore/routes/RepoRoutes.kt @@ -106,6 +106,8 @@ internal fun GitHubRepo.toMetadataOnlyResponse(): RepoResponse = RepoResponse( stargazersCount = stargazersCount, forksCount = forksCount, openIssuesCount = openIssuesCount, + licenseSpdxId = license?.spdxId, + licenseName = license?.name, language = language, topics = topics, releasesUrl = "$htmlUrl/releases", diff --git 
a/src/main/kotlin/zed/rainxch/githubstore/routes/Routing.kt b/src/main/kotlin/zed/rainxch/githubstore/routes/Routing.kt index 1fb4c0f..d90ce45 100644 --- a/src/main/kotlin/zed/rainxch/githubstore/routes/Routing.kt +++ b/src/main/kotlin/zed/rainxch/githubstore/routes/Routing.kt @@ -58,7 +58,7 @@ fun Application.configureRouting() { repoRefreshRoutes(repoRefreshCoordinator, repoRepository) } authRoutes(deviceClient) - internalRoutes(searchMetrics, workerSupervisor) + internalRoutes(searchMetrics, workerSupervisor, githubSearchClient) rateLimit(RateLimitName("signing-seeds")) { signingSeedsRoutes(signingFingerprintRepository) } diff --git a/src/main/kotlin/zed/rainxch/githubstore/routes/SearchRoutes.kt b/src/main/kotlin/zed/rainxch/githubstore/routes/SearchRoutes.kt index eb676e0..cbba5cf 100644 --- a/src/main/kotlin/zed/rainxch/githubstore/routes/SearchRoutes.kt +++ b/src/main/kotlin/zed/rainxch/githubstore/routes/SearchRoutes.kt @@ -177,6 +177,8 @@ private fun zed.rainxch.githubstore.db.MeiliRepoHit.toRepoResponse() = RepoRespo stargazersCount = stars, forksCount = forks, openIssuesCount = open_issues, + licenseSpdxId = license_spdx_id, + licenseName = license_name, language = language, topics = topics, releasesUrl = "$html_url/releases", diff --git a/src/main/resources/db/migration/V15__license_info.sql b/src/main/resources/db/migration/V15__license_info.sql new file mode 100644 index 0000000..c23c937 --- /dev/null +++ b/src/main/resources/db/migration/V15__license_info.sql @@ -0,0 +1,25 @@ +-- V15: track license info on the repos table so the details screen can +-- surface "MIT License" / "GPL-3.0" / etc. without an extra GitHub +-- passthrough call. Both columns nullable -- not every repo has a +-- detected license. 
+-- +-- Two columns instead of one JSONB: +-- * license_spdx_id -- short tag ("MIT", "GPL-3.0", "Apache-2.0") +-- * license_name -- full name ("MIT License") +-- +-- We skip GitHub's `key` and `url` fields -- the client renders by +-- spdx_id and links to the GitHub repo's LICENSE file directly. +-- +-- Backend writes the columns on: +-- * search passthrough ingest +-- * POST /v1/repo/{owner}/{name}/refresh +-- * RepoRefreshWorker hourly cycle +-- * Python fetcher daily run (once that repo wires the field through) +-- +-- Idempotent: ADD COLUMN IF NOT EXISTS handles re-runs. + +ALTER TABLE repos + ADD COLUMN IF NOT EXISTS license_spdx_id TEXT; + +ALTER TABLE repos + ADD COLUMN IF NOT EXISTS license_name TEXT;
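
Reviewer note on `docs/client/license-info.md`: the null-handling rules in sections 2 and 4 of that doc could be collapsed into one small pure function on the client, which keeps the chip composable free of branching. The sketch below is illustrative only. `LicenseLabel` and `licenseLabel` are hypothetical names that appear nowhere in this PR, and the `"NOASSERTION"` check assumes the backend stores GitHub's placeholder for unclassifiable license files unchanged.

```kotlin
// Hypothetical client-side helper; not part of the backend or of this PR.
data class LicenseLabel(val chipText: String, val description: String)

// Returns null whenever the license chip should be hidden.
fun licenseLabel(spdxId: String?, name: String?): LicenseLabel? {
    // A row with only one of the two fields set is treated as having
    // no license until it is refreshed (section 2 of the client doc).
    if (spdxId == null || name == null) return null
    // Assumption: GitHub's placeholder for a license file it could not
    // classify is stored as-is, so it is filtered out here.
    if (spdxId == "NOASSERTION") return null
    return LicenseLabel(chipText = spdxId, description = name)
}

fun main() {
    println(licenseLabel("MIT", "MIT License"))
    println(licenseLabel(null, null))
    println(licenseLabel("NOASSERTION", "Other"))
    println(licenseLabel("GPL-3.0", null))
}
```

A composable would call `licenseLabel(repo.licenseSpdxId, repo.licenseName) ?: return` and use `chipText` for the visible label and `description` for the accessibility text. Keeping the rule in a plain function also makes it unit-testable without a Compose runtime.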