feat: add deterministic evaluation report flow by bilalbayram · Pull Request #35 · bilalbayram/opencode-finance

bilalbayram · 2026-03-28T16:15:17Z

Summary

add a deterministic report_evaluation tool and artifact writer for evaluation.md and evaluation-snapshot.json
require those evaluation artifacts in the report PDF flow and render a dedicated page 2 with metric groups and quarterly charts
update report prompts/skill/install-script sync, add focused tests, and bump opencode-finance to 1.2.9-beta.0

Verification

TMPDIR=/tmp bun run typecheck
TMPDIR=/tmp bun run build
bun test src/command/finance.test.ts src/tool/report-pdf.test.ts test/unit/finance-plugin-report-hints.test.ts test/unit/report-skill-contract.test.ts test/unit/report-evaluation-artifacts.test.ts test/unit/openclaw-tool-adapter.test.ts test/unit/install-script-skill-sync.test.ts

Notes

A full bun test run in the temp worktree still shows unrelated environment/pre-existing failures around storage permissions, Node/OpenClaw ESM import, and npm pack timeout.

vercel · 2026-03-28T16:15:22Z

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment

Project	Deployment	Actions	Updated (UTC)
opencode-finance-web-wulr	Ignored	Preview	Mar 30, 2026 1:38pm

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bc490bc19d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-03-28T16:21:04Z

+  const snapshotPath = path.join(input.outputRoot, "evaluation-snapshot.json")
+  const markdown = renderReportEvaluationMarkdown(input.snapshot)
+  const snapshotText = JSON.stringify(ReportEvaluationSnapshotSchema.parse(input.snapshot), null, 2)
+  await Promise.all([Bun.write(evaluationPath, markdown), Bun.write(snapshotPath, snapshotText)])


Replace Bun.write with runtime-neutral file writes

writeReportEvaluationArtifacts uses Bun.write, but this package is also executed in Node-based contexts (for example OpenClaw’s CJS plugin runtime), where Bun is undefined; calling report_evaluation there will throw ReferenceError: Bun is not defined before either artifact is written. This should use the existing runtime-neutral file IO path (fs.writeFile/writeText) so report generation works outside Bun.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-03-28T16:21:04Z

@@ -102,6 +102,6 @@ export const FINANCE_SLASH_COMMANDS: FinanceSlashCommand[] = [
    description: "Generate a comprehensive public-company financial report",
    hints: ["$1", "$2"],
    template:
-      'Build a very comprehensive public-company financial report for "$1".\n\nFirst call the skill tool with name "finance-comprehensive-report".\n\nThen produce:\n1) a concise in-chat executive summary including `directional_conviction_score` (0-100), score band (`bearish`, `neutral`, `bullish`), and top drivers\n2) markdown artifacts in `reports/$1/<YYYY-MM-DD>/`:\n   - `report.md`\n   - `dashboard.md`\n   - `assumptions.json`\n   - `adjustment-log.md`\n\nInside `report.md`, include dedicated sections for technical analysis, fundamental analysis, risk assessment, portfolio fit, market intelligence, scenario valuation, and monitoring triggers.\n\nInside `assumptions.json`, include `scenario_assumptions`, `score_inputs`, `factor_weights`, and `uncertainty_flags`.\n\nUse `financial_search` with `coverage: \"comprehensive\"` for numeric financial claims. Use Exa (`websearch`) only for qualitative news/catalyst context, never for numeric financial metrics.\n\nUse IEEE-style numbered citations: place [N] in-text for each factual claim and collect all references in a final ## Sources section at the end of report.md. Each entry: [N] Publisher/Domain, "title," URL. Retrieved YYYY-MM-DD. Never use generic source labels like `websearch`, `exa`, `search`, or `internet`.\n\nIn `dashboard.md`, include separate KPI rows for Stock price, Previous close, Daily change, Daily change percent, 52W high/low or 52W range, YTD return, Market cap, Analyst consensus, Revenue, Net income, Free cash flow, and Debt-to-equity.\n\nFor insider analysis, call `financial_search` with intent `insider` and `coverage: \"comprehensive\"`, and use returned `ownershipChange` when available.\n\nKeep KPI source labels and source URLs coherent (for example, Yahoo chart label must use Yahoo chart URL).\n\nIn `report.md`, include metadata lines: `Sector: <value>`, `Headquarters: <value>`, `Website: <value>`, and `Icon URL: <value>`.\n\nUse period-aware fundamental labels based on `metricPeriods` (`TTM`, `FY`, `Q`); do not force `(TTM)` when the period differs.\n\nWhen portfolio holdings/benchmark/mandate are missing, render portfolio fit as `Not evaluated: missing portfolio holdings, benchmark, and mandate`.\n\nNever emit the token `N/A` in report artifacts. Use observed values or explicit `unknown`/reason text.\n\nAfter all markdown artifacts are written, ask exactly one user question with the `question` tool:\n- header: `PDF Export`\n- question: `Generate a polished PDF report now?`\n- options:\n  1) `Yes (Recommended)` - Generate a polished PDF in the report directory.\n  2) `No` - Skip PDF generation.\n- custom: `false`\n\nIf user selects `Yes (Recommended)`, call `report_pdf` with:\n- `subcommand`: `report`\n- `outputRoot`: `reports/$1/<YYYY-MM-DD>/`\n- `filename`: `$1-<YYYY-MM-DD>.pdf`\n\nIf `question` is unavailable in this client context, skip PDF export and complete the report normally.\n\nPDF requirements:\n- first page: ticker summary\n- remaining pages: report information from `report.md`, `dashboard.md`, and `assumptions.json` (exclude `adjustment-log.md`)\n- footer on every page must display `opencode.finance` and link to `https://opencode.finance`\n\nInclude source and timestamp for factual claims, label missing values as "unknown", and keep observed data separate from model assumptions and inference. Do not provide investment advice.',
+      'Build a very comprehensive public-company financial report for "$1".\n\nFirst call the skill tool with name "finance-comprehensive-report".\n\nThen produce:\n1) a concise in-chat executive summary including `directional_conviction_score` (0-100), score band (`bearish`, `neutral`, `bullish`), and top drivers\n2) markdown artifacts in `reports/$1/<YYYY-MM-DD>/`:\n   - `report.md`\n   - `dashboard.md`\n   - `assumptions.json`\n   - `adjustment-log.md`\n   - `evaluation.md`\n   - `evaluation-snapshot.json`\n\nInside `report.md`, include dedicated sections for technical analysis, fundamental analysis, risk assessment, portfolio fit, market intelligence, scenario valuation, and monitoring triggers.\n\nInside `assumptions.json`, include `scenario_assumptions`, `score_inputs`, `factor_weights`, and `uncertainty_flags`.\n\nUse `financial_search` with `coverage: \"comprehensive\"` for numeric financial claims. Use Exa (`websearch`) only for qualitative news/catalyst context, never for numeric financial metrics.\n\nUse IEEE-style numbered citations: place [N] in-text for each factual claim and collect all references in a final ## Sources section at the end of report.md. Each entry: [N] Publisher/Domain, "title," URL. Retrieved YYYY-MM-DD. Never use generic source labels like `websearch`, `exa`, `search`, or `internet`.\n\nIn `dashboard.md`, include separate KPI rows for Stock price, Previous close, Daily change, Daily change percent, 52W high/low or 52W range, YTD return, Market cap, Analyst consensus, Revenue, Net income, Free cash flow, and Debt-to-equity.\n\nFor insider analysis, call `financial_search` with intent `insider` and `coverage: \"comprehensive\"`, and use returned `ownershipChange` when available.\n\nKeep KPI source labels and source URLs coherent (for example, Yahoo chart label must use Yahoo chart URL).\n\nIn `report.md`, include metadata lines: `Sector: <value>`, `Headquarters: <value>`, `Website: <value>`, and `Icon URL: <value>`.\n\nUse period-aware fundamental labels based on `metricPeriods` (`TTM`, `FY`, `Q`); do not force `(TTM)` when the period differs.\n\nWhen portfolio holdings/benchmark/mandate are missing, render portfolio fit as `Not evaluated: missing portfolio holdings, benchmark, and mandate`.\n\nNever emit the token `N/A` in report artifacts. Use observed values or explicit `unknown`/reason text.\n\nAfter `report.md`, `dashboard.md`, `assumptions.json`, and `adjustment-log.md` are written, call `report_evaluation` with:\n- `ticker`: `$1`\n- `outputRoot`: `reports/$1/<YYYY-MM-DD>/`\n\nAfter `report_evaluation` writes `evaluation.md` and `evaluation-snapshot.json`, call `report_pdf` with:\n- `subcommand`: `report`\n- `outputRoot`: `reports/$1/<YYYY-MM-DD>/`\n- `filename`: `$1-<YYYY-MM-DD>.pdf`\n\nDo not ask a PDF question and do not skip PDF generation.\n\nPDF requirements:\n- first page: ticker summary\n- second page: deterministic evaluation snapshot from `evaluation-snapshot.json`\n- remaining pages: report information from `report.md`, `dashboard.md`, and `assumptions.json` (exclude `adjustment-log.md`)\n- footer on every page must display `opencode.finance` and link to `https://opencode.finance`\n\nInclude source and timestamp for factual claims, label missing values as "unknown", and keep observed data separate from model assumptions and inference. Do not provide investment advice.',


Don’t require report_evaluation where the tool is unavailable

The /report template now unconditionally requires calling report_evaluation and forbids skipping PDF generation, but OpenClaw v1 currently exposes only financial_search and report_pdf tools, so this flow cannot complete in that environment and report_pdf will block on missing evaluation.md/evaluation-snapshot.json. Add a capability fallback in this template (or expose report_evaluation in OpenClaw) to avoid hard-failing /report there.

Useful? React with 👍 / 👎.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9131dab02e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-03-29T11:22:20Z

+function pickMetricValue(metric: Record<string, unknown>, keys: string[]) {
+  for (const key of keys) {
+    const value = metric[key]
+    const percent = /yield|ratio|margin|growth|roe|roa|roic|return/i.test(key) ? toPercent(value) : toNumber(value)


Stop scaling debt-to-equity as a percent

pickMetricValue treats any key containing ratio as a percentage and runs toPercent, which multiplies values <= 1 by 100. That mis-scales Finnhub’s totalDebtToEquity* fields when they are returned as ratios (for example 0.45 becomes 45), and the downstream Debt to Equity metric is then rendered as 45x instead of 0.45x in evaluation-snapshot.json/PDF outputs. This introduces a material valuation-quality error for low-leverage companies and should exclude debt-to-equity from percent normalization.

Useful? React with 👍 / 👎.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6588d52fac

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-03-29T12:33:20Z

+  const currentRoa = input.currentNetIncome / averageAssets
+  const priorRoa = input.priorNetIncome / input.priorTotalAssets


Compute Piotroski ROA with period-consistent denominators

In calculatePiotroskiFScore, the current-period ROA is divided by averageAssets while the prior-period ROA is divided by priorTotalAssets, so the currentRoa > priorRoa signal compares two differently scaled ratios. When assets change materially year-over-year, this can flip the profitability-improvement point and produce an incorrect F-score in evaluation-snapshot.json/PDF outputs.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-03-29T12:33:20Z

+  const currentRatio = input.currentCurrentAssets / input.currentCurrentLiabilities
+  const priorRatio = input.priorCurrentAssets / input.priorCurrentLiabilities


Guard Piotroski component ratios from divide-by-zero artifacts

Several Piotroski inputs are only null-checked before division, so zero-valued denominators (for example currentCurrentLiabilities or priorCurrentLiabilities) produce Infinity/NaN and can incorrectly award or suppress score points instead of marking the metric unknown. This is especially problematic when provider payloads contain zeros for incomplete statement fields, because it silently corrupts the deterministic score.

Useful? React with 👍 / 👎.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 446d56c6ea

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-03-30T12:58:27Z

+      evaluationMarkdown: await readRequired(path.join(root, "evaluation.md")),
+      evaluationSnapshotJson: await readRequired(path.join(root, "evaluation-snapshot.json")),


Keep report_pdf compatible with the packaged report skill

The report profile now hard-requires evaluation.md and evaluation-snapshot.json, but the packaged OpenClaw report skill (packages/opencode-finance/skills/report/SKILL.md) still instructs generating only report.md, dashboard.md, assumptions.json, and adjustment-log.md before calling report_pdf. In that shipped workflow, report_pdf will now fail with missing-artifact errors, so /report cannot complete unless users manually add an extra tool step. Please either update the packaged skill to invoke report_evaluation first or add a compatibility path for legacy report artifacts.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-03-30T12:58:27Z

+if (!existsSync(distIndex) || !existsSync(distOpenClaw)) {
+  const build = spawnSync("bun", ["run", "build"], {


Rebuild dist unconditionally during prepack

This prepack script now skips bun run build whenever dist/index.js and dist/openclaw/index.cjs already exist, which allows stale compiled output and copied assets to be published after source changes. A maintainer running npm pack/npm publish locally with an existing dist/ can silently ship old code even though the source has changed; prepack should rebuild (or at least verify freshness) every time.

Useful? React with 👍 / 👎.

feat(opencode-finance): add deterministic evaluation report flow

bc490bc

chatgpt-codex-connector bot reviewed Mar 28, 2026

View reviewed changes

bilalbayram added 2 commits March 29, 2026 14:09

chore(release): allow beta publishes off main

6664457

fix(release): honor manual beta dispatches

9131dab

chatgpt-codex-connector bot reviewed Mar 29, 2026

View reviewed changes

feat(opencode-finance): simplify evaluation snapshot for yahoo beta1

6588d52

chatgpt-codex-connector bot reviewed Mar 29, 2026

View reviewed changes

bilalbayram added 2 commits March 30, 2026 15:27

refactor: reorganize opencode-finance source tree

446d56c

refactor: simplify pdf quality gates

6177e35

chatgpt-codex-connector bot reviewed Mar 30, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add deterministic evaluation report flow#35

feat: add deterministic evaluation report flow#35
bilalbayram wants to merge 6 commits intomainfrom
feat/eval-page

bilalbayram commented Mar 28, 2026

Uh oh!

vercel bot commented Mar 28, 2026 •

edited

Loading

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Mar 28, 2026

Uh oh!

chatgpt-codex-connector bot Mar 28, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Mar 29, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Mar 29, 2026

Uh oh!

chatgpt-codex-connector bot Mar 29, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Mar 30, 2026

Uh oh!

chatgpt-codex-connector bot Mar 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		const currentRoa = input.currentNetIncome / averageAssets
		const priorRoa = input.priorNetIncome / input.priorTotalAssets

		const currentRatio = input.currentCurrentAssets / input.currentCurrentLiabilities
		const priorRatio = input.priorCurrentAssets / input.priorCurrentLiabilities

		evaluationMarkdown: await readRequired(path.join(root, "evaluation.md")),
		evaluationSnapshotJson: await readRequired(path.join(root, "evaluation-snapshot.json")),

		if (!existsSync(distIndex) \|\| !existsSync(distOpenClaw)) {
		const build = spawnSync("bun", ["run", "build"], {

Conversation

bilalbayram commented Mar 28, 2026

Summary

Verification

Notes

Uh oh!

vercel bot commented Mar 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Mar 28, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot Mar 28, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

vercel bot commented Mar 28, 2026 •

edited

Loading