Boltz Swap/VHTLC pkgs added #113
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
Walkthrough

Adds Boltz chain-swap and Lightning swap support (REST + WS), MuSig2 signing and batch-session handlers, VHTLC/Taproot utilities, chain-swap orchestration/validation, PSBT/tapscript tooling, explorer integration, extensive infra and e2e tests, and CI/Makefile/tooling updates.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant SwapHandler
    participant BoltzAPI as Boltz REST
    participant BoltzWS as Boltz WS
    participant Ark as Ark Client
    participant BTC as Bitcoin Network
    participant Explorer as ExplorerClient
    User->>SwapHandler: ChainSwapArkToBtc(amount, btcAddr)
    SwapHandler->>BoltzAPI: CreateChainSwap(request)
    BoltzAPI-->>SwapHandler: swapResp (swapId, swapTree, lockup)
    SwapHandler->>Ark: SendOffChain(vhtlcAddr)
    Ark-->>SwapHandler: userLockTxid
    SwapHandler->>BoltzWS: Subscribe(swapId)
    BoltzWS-->>SwapHandler: swap.created
    BoltzWS-->>SwapHandler: transaction.server.mempool
    SwapHandler->>SwapHandler: claimBtcLockup() (MuSig flow)
    SwapHandler->>BoltzAPI: GetChainSwapClaimDetails()
    BoltzAPI-->>SwapHandler: pubNonce, serverPartial
    SwapHandler->>SwapHandler: Combine partials -> finalSig
    SwapHandler->>BTC: Broadcast(claimTx)
    BTC-->>Explorer: tx confirmed
    Explorer-->>SwapHandler: confirmation
    SwapHandler->>Ark: Settle(ctx)
    Ark-->>SwapHandler: settledTxid
    User-->>SwapHandler: ChainSwapStatus = Claimed
```

```mermaid
sequenceDiagram
    participant User
    participant SwapHandler
    participant BoltzAPI
    participant BoltzWS
    participant Ark
    User->>SwapHandler: PayInvoice(invoice)
    SwapHandler->>BoltzAPI: CreateSwap(submarine)
    BoltzAPI-->>SwapHandler: boltInvoice
    SwapHandler->>Ark: SendOffChain(vhtlcAddr)
    BoltzWS-->>SwapHandler: invoice.settled
    SwapHandler->>SwapHandler: ClaimVHTLC(preimage)
    SwapHandler->>Ark: Submit PSBT/checkpoint, verify sigs
    Ark-->>SwapHandler: txid
    User-->>SwapHandler: SwapStatus = Success
```
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
🔍 Arkana PR Review
Actionable comments posted: 10
Note: Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
🟠 Major comments (26)
test/infra/mockboltz/server.go-589-608 (1)
589-608: ⚠️ Potential issue | 🟠 Major

Potential race condition: `client.subs` map modified without proper synchronization.

The subscription map `client.subs` is modified at lines 596-598 while only `client.mu` is used for `WriteJSON`. However, `pushSwapUpdate` (lines 945-948) reads `client.subs` under `wsMu.RLock` without holding `client.mu`, which could race with concurrent subscribe operations.

🔒 Proposed fix: Use `client.mu` consistently for `subs` access

```diff
 if msg.Op == "subscribe" && msg.Channel == "swap.update" {
 	s.wsMu.RLock()
 	client := s.wsClients[conn]
 	s.wsMu.RUnlock()
 	if client == nil {
 		continue
 	}
+	client.mu.Lock()
 	for _, id := range msg.Args {
 		client.subs[id] = struct{}{}
 	}
+	client.mu.Unlock()
 	ack := map[string]any{
```

And in `pushSwapUpdate`:

```diff
 for conn, client := range s.wsClients {
+	client.mu.Lock()
-	if _, ok := client.subs[id]; !ok {
+	_, subscribed := client.subs[id]
+	client.mu.Unlock()
+	if !subscribed {
 		continue
 	}
-	client.mu.Lock()
+	client.mu.Lock()
 	err := conn.WriteJSON(payload)
 	client.mu.Unlock()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/infra/mockboltz/server.go` around lines 589 - 608, The subscribe handler modifies client.subs without holding client.mu, which races with pushSwapUpdate reading subs; update the subscribe branch to acquire client.mu before modifying client.subs (and while writing the ack if desired) and release afterwards, and modify pushSwapUpdate to acquire client.mu when accessing/iterating client.subs (while still using s.wsMu.RLock to access s.wsClients if needed) so all reads/writes to client.subs are protected by client.mu; refer to client.subs, client.mu, the subscribe handler (conn.WriteJSON/ack block) and pushSwapUpdate to implement the changes.

test/e2e/swap_test.go-48-53 (1)
48-53: ⚠️ Potential issue | 🟠 Major

Synchronize `events` access in the async callbacks.

`eventCallback` appends to `events` from handler goroutines while the test goroutine later reads the same slice without locking. Since `integrationtest` runs with `-race`, this pattern can fail the suite. A mutex or channel-backed collector is needed here, and in the repeated callback blocks below.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/swap_test.go` around lines 48 - 53, The test currently appends to the shared slice "events" from the asynchronous "eventCallback" goroutines (type swap.ChainSwapEvent) without synchronization; replace unsynchronized appends with a thread-safe collector—either protect "events" with a sync.Mutex (lock/unlock around reads/writes in eventCallback and any test reads) or switch to a channel-backed collector that the test goroutine consumes and stores into "events"—and update all callback occurrences that append to "events" to use the chosen synchronization mechanism so the race reported by -race is eliminated.

swap/btc_util.go-213-236 (1)
213-236: ⚠️ Potential issue | 🟠 Major

Use the swap tree's actual leaf versions when building the control block.

`computeSwapTreeMerkleRoot` hashes `swapTree.ClaimLeaf.Version` and `swapTree.RefundLeaf.Version`, but this helper recomputes the sibling hash with `NewBaseTapLeaf` and hardcodes `BaseLeafVersion` in the control-block header. If Boltz returns a non-base leaf version, the control block will not match the merkle root above.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swap/btc_util.go` around lines 213 - 236, The control-block uses a hardcoded txscript.BaseLeafVersion and rebuilds siblingLeaf with NewBaseTapLeaf, which can mismatch computeSwapTreeMerkleRoot that uses swapTree.ClaimLeaf.Version and swapTree.RefundLeaf.Version; update the logic that constructs siblingLeaf and leafVersionByte so the leaf version comes from the corresponding swapTree leaf (e.g., swapTree.ClaimLeaf.Version or swapTree.RefundLeaf.Version) instead of txscript.BaseLeafVersion, and ensure the sibling hash (siblingLeaf.TapHash()) is computed from a TapLeaf built with that same version so the controlBlock (the bytes appended to controlBlock and the leafVersionByte) matches the merkle root computed by computeSwapTreeMerkleRoot.

swap/utils.go-188-194 (1)
188-194: ⚠️ Potential issue | 🟠 Major

`>= 512` is not a safe seconds-vs-blocks discriminator.

This converts every block delay of 512+ into `LocktimeTypeSecond`, so common CSV values like 2016 blocks will round-trip incorrectly. If `locktime` is BIP68-encoded, inspect the type bit; if it is already normalized, the original type has to be preserved instead of inferred from the magnitude.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swap/utils.go` around lines 188 - 194, parseLocktime currently infers seconds vs blocks by magnitude (>=512) which misclassifies many BIP68-encoded values; update parseLocktime to inspect the BIP68 "type" bit instead of using a numeric threshold: detect the BIP68 type bit on the incoming uint32 and return arklib.RelativeLocktime with Type set to arklib.LocktimeTypeSecond or arklib.LocktimeTypeBlock based on that bit, and use the remaining encoded value as Value; if the input is already a normalized relative locktime representation, preserve its original type rather than re-inferring from magnitude. Ensure you reference and use parseLocktime, arklib.RelativeLocktime, LocktimeTypeSecond and LocktimeTypeBlock when making the change.

swap/utils.go-31-33 (1)
31-33: ⚠️ Potential issue | 🟠 Major

Don't ignore checkpoint tapscript decode errors.

A malformed `cfg.CheckpointTapscript` currently returns an empty or partial byte slice and pushes the real failure into later signing/verification code. Fail fast here instead.

💡 Fail fast on bad config

```diff
-func checkpointExitScript(cfg clientTypes.Config) []byte {
-	buf, _ := hex.DecodeString(cfg.CheckpointTapscript)
-	return buf
+func checkpointExitScript(cfg clientTypes.Config) ([]byte, error) {
+	buf, err := hex.DecodeString(cfg.CheckpointTapscript)
+	if err != nil {
+		return nil, fmt.Errorf("invalid checkpoint tapscript: %w", err)
+	}
+	return buf, nil
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swap/utils.go` around lines 31 - 33, The function checkpointExitScript currently ignores hex.DecodeString errors which hides malformed cfg.CheckpointTapscript; change checkpointExitScript to return ([]byte, error) instead of just []byte, check the error from hex.DecodeString(cfg.CheckpointTapscript) and return it immediately on failure, and update all callers to handle the error (propagate or handle appropriately) so configuration decode failures fail fast and don't surface later during signing/verification.

Makefile-17-20 (1)
17-20: ⚠️ Potential issue | 🟠 Major

Scope the catch-all rule to the bump commands only.

The global `%:` fallback makes any misspelled `make` target succeed as a no-op, which can silently skip CI or local automation. Keep the dummy target generation limited to the positional `COMMIT` argument instead of installing a repo-wide catch-all.

💡 Safer pattern

```diff
-# Allow positional commit arguments, e.g.:
-#   make bump-client-lib <git-sha-or-tag>
-%:
-	@:
+# Allow positional commit arguments only for the bump targets.
+ifneq ($(filter bump-client-lib bump-ark-lib,$(firstword $(MAKECMDGOALS))),)
+ifneq ($(COMMIT),)
+$(eval $(COMMIT):;@:)
+endif
+endif
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Makefile` around lines 17 - 20, Remove the global catch-all rule `%:` and replace it with a scoped pattern for the bump commands only (e.g., use a `bump-%:` pattern target or explicit `bump-client-lib:` target) so misspelled make targets no longer silently succeed; ensure the new target accepts the positional COMMIT variable (the existing usage like `make bump-client-lib <git-sha-or-tag>` still works) and delete the original `%:` block from the Makefile.

go.mod-3-3 (1)
3-3: ⚠️ Potential issue | 🟠 Major

Align the module `go` version with CI before merging.

The workflows are still pinned to `go 1.26.0` while this PR updates go.mod to `1.26.1`. This mismatch will cause `go mod tidy` to behave differently in CI than locally, breaking builds. Either update `.github/workflows/unit.yaml` and `.github/workflows/integration.yaml` to use `go-version: '1.26.1'`, or revert the module directive until runners are updated.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@go.mod` at line 3, go.mod was bumped to "go 1.26.1" but CI workflows are still pinned to 1.26.0, causing inconsistent go mod behavior; either update the CI workflow files named unit.yaml and integration.yaml to set go-version: '1.26.1' so CI matches go.mod, or revert the module directive in go.mod back to "go 1.26.0" to match current runners — pick one approach and apply it consistently across go.mod and the two workflow files.

swap/utils.go-269-285 (1)
269-285: ⚠️ Potential issue | 🟠 Major

Honor context cancellation during the backoff sleep.

After `fn` returns `(false, nil)`, the loop blocks on `time.After(interval)` and will not observe `ctx.Done()` until the entire interval elapses. That keeps retries alive past cancellation/timeouts.

💡 Context-aware sleep

```diff
-	<-time.After(interval)
+	timer := time.NewTimer(interval)
+	select {
+	case <-ctx.Done():
+		if !timer.Stop() {
+			<-timer.C
+		}
+		if errors.Is(ctx.Err(), context.DeadlineExceeded) {
+			return fmt.Errorf("timed out")
+		}
+		return ctx.Err()
+	case <-timer.C:
+	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swap/utils.go` around lines 269 - 285, The retry loop currently blocks on <-time.After(interval) and doesn't observe ctx cancellation during the sleep; change the sleep to be context-aware by selecting between a timer (or time.After channel) and ctx.Done() so that when fn returns (false, nil) the code waits for either the interval or context cancellation/timeout, returning ctx.Err() if cancelled; update the loop around fn(ctx) and the sleep (the select after the fn call that currently uses time.After) to use select { case <-time.After(interval): continue; case <-ctx.Done(): return ctx.Err() } or an equivalent timer-based select to avoid extra retries after cancellation.

swap/bolt12.go-115-118 (1)
115-118: ⚠️ Potential issue | 🟠 Major

Reject non-whole-satoshi amounts instead of truncating them.

Both paths currently do `amount / 1000`, which silently floors. For example, `1500` becomes `1` sat. That undercounts the requested amount instead of surfacing that it cannot be represented exactly in sats.

Suggested fix

```diff
-	invoiceData.AmountInSats, err = safecast.ToUint64(invoiceData.Amount / 1000)
+	if invoiceData.Amount%1000 != 0 {
+		return nil, errors.New("invoice amount must be a whole number of satoshis")
+	}
+	invoiceData.AmountInSats, err = safecast.ToUint64(invoiceData.Amount / 1000)
@@
-	offerData.AmountInSats, err = safecast.ToUint64(offerData.Amount / 1000)
+	if offerData.Amount%1000 != 0 {
+		return nil, errors.New("offer amount must be a whole number of satoshis")
+	}
+	offerData.AmountInSats, err = safecast.ToUint64(offerData.Amount / 1000)
```

Also applies to: 167-170
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swap/bolt12.go` around lines 115 - 118, The code currently computes sats by doing invoiceData.Amount / 1000 which silently truncates fractional sats; instead, detect and reject non-whole-satoshi amounts by checking that invoiceData.Amount % 1000 == 0 before converting, and return a clear error if there is a remainder. Replace the two spots that use invoiceData.Amount / 1000 (the block that sets invoiceData.AmountInSats and the similar code at the later occurrence) with logic that verifies exact divisibility, then compute safecast.ToUint64(invoiceData.Amount/1000) only after the divisibility check succeeds (referencing invoiceData.Amount and invoiceData.AmountInSats to locate the changes).

swap/chainswap_resume.go-133-171 (1)
133-171: ⚠️ Potential issue | 🟠 Major

Do not tie resumed swap monitoring to the caller's context.

`ResumeChainSwap` starts background goroutines, but both inherit the resume call's `ctx`. If the caller uses a timeout or request-scoped context, the resumed swap can stop monitoring immediately after this method returns and miss claim/refund transitions. Use a handler-owned lifecycle context instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swap/chainswap_resume.go` around lines 133 - 171, The background goroutines in ResumeChainSwap are using the caller's ctx, which can be cancelled by the caller and prematurely stop monitoring; change them to use a handler-owned lifecycle context (e.g., h.lifecycleCtx or h.backgroundCtx) instead of the passed-in ctx/params ctx so resumed swaps run independent of request-scoped timeouts. Update the two goroutine invocations that call monitorAndClaimArkToBtcSwap(...) and monitorBtcToArkChainSwap(...) to pass the handler-owned context, and if the handler does not yet have such a context add one to the handler (and ensure it is cancelled when the handler is shut down). Ensure all other parameters (h.privateKey, preimage, &swapResp, swap, params.EventCallback, params.UnilateralRefundCB, params.UserBtcAddress) remain unchanged.

swap/batch_handler.go-398-413 (1)
398-413: ⚠️ Potential issue | 🟠 Major

Enforce the single-forfeit assumption in collaborative refunds.

This constructor accepts any number of `vtxosToForfeit`, but the finalization path only consumes one `partialForfeitTx`, uses `connectors[0]`, and submits a single signed tx. If more than one VTXO needs a collaborative refund, the rest are dropped silently.

Suggested fix

```diff
 	if len(handler.vtxosToForfeit) > 0 && partialForfeitTx == "" {
 		return nil, fmt.Errorf("missing partial forfeit tx")
 	}
+	if len(handler.vtxosToForfeit) > 1 {
+		return nil, fmt.Errorf("collaborative refund supports exactly one forfeitable vtxo")
+	}
 	return &collabRefundBatchSessionHandler{
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swap/batch_handler.go` around lines 398 - 413, The constructor accepts multiple vtxosToForfeit but the collabRefundBatchSessionHandler finalization only handles one partialForfeitTx and uses connectors[0], dropping extras; fix by enforcing the single-forfeit assumption: after newBatchSessionHandler returns, add a guard that if len(handler.vtxosToForfeit) > 1 then return an error (e.g., "multiple vtxosToForfeit not supported") unless you also accept and thread multiple partialForfeitTx values and connector handling; reference newBatchSessionHandler, collabRefundBatchSessionHandler, vtxosToForfeit, partialForfeitTx and connectors to locate where to add the check.

swap/boltz/types.go-189-213 (1)
189-213: ⚠️ Potential issue | 🟠 Major

`GetSwapTree` can panic when the selected leg has no BTC tree.

`SwapLeg.SwapTree` is a pointer and this same file marks it as BTC-specific, but `GetSwapTree` dereferences it unconditionally. A malformed/error response or wrong-leg selection will panic during resume/validation instead of returning a normal error.

Suggested fix

```diff
-func (c CreateChainSwapResponse) GetSwapTree(isArkToBtc bool) SwapTree {
-	if isArkToBtc {
-		return SwapTree{
-			ClaimLeaf: SwapTreeLeaf{
-				Version: c.ClaimDetails.SwapTree.ClaimLeaf.Version,
-				Output:  c.ClaimDetails.SwapTree.ClaimLeaf.Output,
-			},
-			RefundLeaf: SwapTreeLeaf{
-				Version: c.ClaimDetails.SwapTree.RefundLeaf.Version,
-				Output:  c.ClaimDetails.SwapTree.RefundLeaf.Output,
-			},
-		}
-	}
-
-	return SwapTree{
-		ClaimLeaf: SwapTreeLeaf{
-			Version: c.LockupDetails.SwapTree.ClaimLeaf.Version,
-			Output:  c.LockupDetails.SwapTree.ClaimLeaf.Output,
-		},
-		RefundLeaf: SwapTreeLeaf{
-			Version: c.LockupDetails.SwapTree.RefundLeaf.Version,
-			Output:  c.LockupDetails.SwapTree.RefundLeaf.Output,
-		},
-	}
+func (c CreateChainSwapResponse) GetSwapTree(isArkToBtc bool) (*SwapTree, error) {
+	var leg SwapLeg
+	if isArkToBtc {
+		leg = c.ClaimDetails
+	} else {
+		leg = c.LockupDetails
+	}
+	if leg.SwapTree == nil {
+		return nil, fmt.Errorf("missing BTC swap tree in response")
+	}
+	return leg.SwapTree, nil
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swap/boltz/types.go` around lines 189 - 213, GetSwapTree currently dereferences BTC-specific pointer fields unconditionally which can panic when SwapLeg.SwapTree is nil; update CreateChainSwapResponse.GetSwapTree to nil-check the pointer fields before accessing them (both c.ClaimDetails.SwapTree and c.LockupDetails.SwapTree / their ClaimLeaf/RefundLeaf) and return a clear error or zero-value SwapTree when the selected leg has no BTC tree, e.g., detect nil and handle by returning an error (or a safe default) instead of dereferencing; ensure the function signature and callers of GetSwapTree (CreateChainSwapResponse.GetSwapTree) are updated to handle the error return if you choose to return an error.

swap/batch_handler.go-148-151 (1)
148-151: ⚠️ Potential issue | 🟠 Major

Filter `OnBatchFailed` by batch session ID to avoid aborting on unrelated batch failures.

`OnBatchFinalized` already filters events by `event.Id == h.batchSessionId`, and `OnBatchStarted` only sets `batchSessionId` when the intent matches. `OnBatchFailed` must apply the same check—otherwise, failures for unrelated batches will abort this swap. Since `batchSessionId` starts empty, also guard against that.

Suggested fix

```diff
 func (h *batchSessionHandler) OnBatchFailed(
 	ctx context.Context, event client.BatchFailedEvent,
 ) error {
+	if h.batchSessionId == "" || event.Id != h.batchSessionId {
+		return nil
+	}
 	return fmt.Errorf("batch failed: %s", event.Reason)
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swap/batch_handler.go` around lines 148 - 151, OnBatchFailed should ignore failures for unrelated batches: add a guard in batchSessionHandler.OnBatchFailed that first checks h.batchSessionId is non-empty and then returns nil unless event.Id == h.batchSessionId; only format and return the error (fmt.Errorf("batch failed: %s", event.Reason)) when the IDs match. This mirrors the filtering used in OnBatchFinalized and prevents aborting the swap on unrelated batch failures.

swap/explorer.go-111-121 (1)
111-121: ⚠️ Potential issue | 🟠 Major

Route every explorer call through `e.client`.

Three methods bypass the client configured in `NewExplorerClient`, so they also bypass its 10-second timeout and any future custom transport/retry settings. A stalled explorer call in `GetCurrentBlockHeight`, `GetTransactionStatus`, or `GetTransaction` can hang swap monitoring indefinitely instead of timing out as intended.

Suggested fix

```diff
-	resp, err := http.Get(endpoint)
+	resp, err := e.client.Get(endpoint)
```

Apply the same replacement at line 117 (`GetCurrentBlockHeight`), line 147 (`GetTransactionStatus`), and line 189 (`GetTransaction`).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swap/explorer.go` around lines 111 - 121, The three methods GetCurrentBlockHeight, GetTransactionStatus, and GetTransaction currently call the global http.Get and bypass the configured e.client (and its timeout/transport), so replace each http.Get call with e.client.Get (keeping endpoint construction via url.JoinPath and existing response handling/Close), ensuring all explorer HTTP requests use the client initialized in NewExplorerClient so they respect the 10s timeout and custom transport/retry settings.

test/e2e/utils_test.go-243-259 (1)
243-259: ⚠️ Potential issue | 🟠 Major

Add timeouts to all mock-boltz HTTP helper functions.

All mock-boltz helpers (`setMockBoltzConfig`, `resetMockBoltz`, `injectMockBoltzSwapEvent`, `getMockBoltzSwap`, `waitForMockBoltzHealth`, `injectMockBoltzSwapEventWithTx`, `getMockBoltzSwapTyped`) use `http.DefaultClient.Do()` or `http.Get()` without timeouts. A stuck or slow mock-boltz container can hang the entire E2E test suite indefinitely. Create a package-level HTTP client with a timeout (e.g., `var e2eHTTPClient = &http.Client{Timeout: 5 * time.Second}`) and use it consistently across all these helpers to fail fast on stalled connections.

Suggested fix

```diff
+var e2eHTTPClient = &http.Client{Timeout: 5 * time.Second}
+
 func setMockBoltzConfig(t *testing.T, cfg map[string]any) {
 	t.Helper()
@@
-	resp, err := http.DefaultClient.Do(req)
+	resp, err := e2eHTTPClient.Do(req)
@@
-	resp, err := http.Get(url) //nolint:gosec
+	resp, err := e2eHTTPClient.Get(url)
```

Apply the same replacement throughout all mock-boltz helpers at lines 243–259, 262–272, 276–293, 296–307, 310–324, 430–451, and 463–474.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/utils_test.go` around lines 243 - 259, Create a package-level HTTP client with a timeout (e.g., var e2eHTTPClient = &http.Client{Timeout: 5 * time.Second}) and replace all uses of http.DefaultClient.Do(...) and http.Get(...) in the mock-boltz helpers so they use e2eHTTPClient (for example, in setMockBoltzConfig, resetMockBoltz, injectMockBoltzSwapEvent, getMockBoltzSwap, waitForMockBoltzHealth, injectMockBoltzSwapEventWithTx, and getMockBoltzSwapTyped); ensure request creation and resp handling stays the same but calls use e2eHTTPClient.Do(req) or e2eHTTPClient.Get(url) to enforce the timeout and fail fast on stalled connections.

vhtlc/vhtlc.go-88-105 (1)
88-105: ⚠️ Potential issue | 🟠 Major

Reject mismatched preimage hashes when rebuilding a VHTLC.

`NewVhtlcScript` builds `preimageCondition` from `preimageHash`, but never verifies that `claimLeaf` and `unilateralClaimLeaf` embed that same condition. That lets callers construct a `VHTLCScript` whose spend paths use one hash while `Opts()` reports another, so round-trips and downstream validation become unsound.

Suggested fix

```diff
 import (
+	"bytes"
 	"encoding/hex"
 	"fmt"
@@
 	unilateralClaimClosure, err := parseUnilateralClaimClosure(unilateralClaimLeaf)
 	if err != nil {
 		return nil, err
 	}
+	if !bytes.Equal(claimClosure.Condition, preimageCondition) {
+		return nil, fmt.Errorf("claim leaf preimage condition mismatch")
+	}
+	if !bytes.Equal(unilateralClaimClosure.Condition, preimageCondition) {
+		return nil, fmt.Errorf("unilateral claim leaf preimage condition mismatch")
+	}
```

Also applies to: 107-145
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@vhtlc/vhtlc.go` around lines 88 - 105, NewVhtlcScript builds preimageCondition from preimageHash (via makePreimageConditionScript) but does not verify that claimLeaf and unilateralClaimLeaf actually embed that same condition, allowing inconsistent scripts; update NewVhtlcScript to extract the embedded preimage condition from claimLeaf and unilateralClaimLeaf (or parse their script templates) and compare it to the computed preimageCondition, returning an error if any mismatch is found; do the same verification in the alternate construction path that handles the other branch (the similar code later that also builds preimageCondition) so Opts() and the actual spend paths remain consistent.

swap/boltz/ws.go-42-42 (1)
42-42: ⚠️ Potential issue | 🟠 Major

Make subscription acks non-blocking.

`subscriptions` is unbuffered, so if the ack arrives after the timeout path fires—or Boltz sends a duplicate ack—the reader goroutine blocks forever on the send and stops processing updates.

Suggested fix

```diff
-	subscriptions chan bool
+	subscriptions chan struct{}
@@
-	subscriptions: make(chan bool),
+	subscriptions: make(chan struct{}, 1),
@@
 	case "subscribe":
-		boltz.subscriptions <- true
+		select {
+		case boltz.subscriptions <- struct{}{}:
+		default:
+		}
 		continue
```

Also applies to: 67-67, 156-158, 199-204
Verify each finding against the current code and only fix it if needed. In `@swap/boltz/ws.go` at line 42, The subscriptions channel currently declared as "subscriptions chan bool" must be made non-blocking: change it to a buffered channel (e.g., buffer size 1, or use chan struct{} with buffer 1) and update all send sites that ack subscriptions (the reader goroutine and the places that send subscription acks) to perform a non-blocking send using a select with a default case so duplicate or late acks do not block (e.g., select { case subscriptions <- true: default: }). Ensure receivers still read/drain the channel where intended.

swap/chainswap_monitor.go-180-183 (1)
180-183: ⚠️ Potential issue | 🟠 Major

Don't translate `ctx.Done()` into a swap failure.

Context cancellation is a local lifecycle event, not evidence that the on-chain swap failed. With the new resume flow, shutting down or restarting monitoring would incorrectly persist the swap as failed instead of leaving it resumable.

Suggested fix

```diff
 	case <-ctx.Done():
 		log.Infof("Context cancelled for swap %s monitoring", swapId)
-		chainSwapState.Swap.Fail("context cancelled")
 		return
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swap/chainswap_monitor.go` around lines 180 - 183, The ctx.Done() case should not mark the on-chain swap as failed; remove or stop calling chainSwapState.Swap.Fail("context cancelled") in the select branch that handles <-ctx.Done(). Instead, keep a clear log using swapId and chainSwapState (e.g., log.Infof("Context cancelled for swap %s monitoring", swapId)) and simply return so the monitor exits without persisting a failure, leaving the swap resumable by the resume flow.

swap/chainswap_btc_ark_handler.go-272-277 (1)
272-277: ⚠️ Potential issue | 🟠 Major

Validate the claim hash length before signing.

`hex.DecodeString` only guarantees valid hex. `copy(msg[:], txHashBytes)` silently zero-pads or truncates anything not exactly 32 bytes, so you can submit a partial signature over the wrong message.

Suggested fix

```diff
 	txHashBytes, err := hex.DecodeString(claimDetails.TransactionHash)
 	if err != nil {
 		return fmt.Errorf("decode transaction hash: %w", err)
 	}
+	if len(txHashBytes) != 32 {
+		return fmt.Errorf("transaction hash must be 32 bytes, got %d", len(txHashBytes))
+	}
 	var msg [32]byte
 	copy(msg[:], txHashBytes)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swap/chainswap_btc_ark_handler.go` around lines 272 - 277, The code decodes claimDetails.TransactionHash into txHashBytes and then blindly copies into msg [32]byte, which can silently truncate or zero-pad; update the logic in the handler (where txHashBytes, msg, and copy are used) to validate that len(txHashBytes) == 32 after hex.DecodeString and return a clear error if not (e.g., "invalid transaction hash length: expected 32 bytes, got X") before performing the copy/signing step so you never sign a partial or padded message.

vhtlc/vhtlc.go-284-294 (1)
284-294: ⚠️ Potential issue | 🟠 Major

Return a copy of `PreimageHash` from `Opts()`.

This currently exposes a subslice of internal state. For scripts built via `NewVHTLCScriptFromOpts`, mutating the returned `PreimageHash` also mutates the stored claim condition, so the same `VHTLCScript` instance can be corrupted after construction.

Suggested fix

```diff
 func (v *VHTLCScript) Opts() Opts {
+	preimageHash := append([]byte(nil), v.preimageConditionScript[2:2+hash160Len]...)
 	return Opts{
 		Sender:   v.Sender,
 		Receiver: v.Receiver,
 		Server:   v.Server,
-		PreimageHash: v.preimageConditionScript[2 : 2+hash160Len],
+		PreimageHash: preimageHash,
 		RefundLocktime:                       v.RefundWithoutReceiverClosure.Locktime,
 		UnilateralClaimDelay:                 v.UnilateralClaimClosure.Locktime,
 		UnilateralRefundDelay:                v.UnilateralRefundClosure.Locktime,
 		UnilateralRefundWithoutReceiverDelay: v.UnilateralRefundWithoutReceiverClosure.Locktime,
 	}
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@vhtlc/vhtlc.go` around lines 284 - 294, Opts() currently returns PreimageHash as a subslice of v.preimageConditionScript exposing internal state; change it to return a copy (e.g. use append([]byte(nil), v.preimageConditionScript[2:2+hash160Len]) or allocate and copy) so modifying the returned Opts.PreimageHash does not mutate VHTLCScript.preimageConditionScript; update VHTLCScript.Opts() to produce that copied byte slice.

swap/chainswap.go-920-943 (1)
920-943: ⚠️ Potential issue | 🟠 Major

Make the confirmation wait loop cancellable.

After broadcast, this loop sleeps/polls for up to two hours without checking `ctx.Done()`. That makes shutdown and request timeouts ineffective and can pin goroutines under load.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swap/chainswap.go` around lines 920 - 943, The confirmation wait loop that polls GetTransactionStatus for claimTxid (using startTime, maxWaitTime, pollInterval) is not observing context cancellation; refactor the loop in the method containing this code to use a select that listens on ctx.Done() as well as a time.After/poll ticker so you can cancel immediately: call h.explorerClient.GetTransactionStatus(ctx, ...) if that client supports context or invoke it inside the select when ready, and when ctx is canceled return/exit the function (or set an appropriate error) instead of continuing to sleep — ensure confirmed handling (setting confirmed = true and breaking) remains unchanged when a confirmation is received.

swap/chainswap.go-654-661 (1)
654-661: ⚠️ Potential issue | 🟠 Major

Emit the BTC funding address in the create event.

`NewChainSwap` fires `CreateEvent` immediately, but this call passes `""` and only fills `UserBtcLockupAddress` afterward. Any listener persisting the create event will store an empty funding address for BTC→ARK swaps.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swap/chainswap.go` around lines 654 - 661, CreateEvent is fired inside NewChainSwap but the BTC funding address is passed as an empty string, so listeners persist an empty address; fix by supplying the real address (swapResp.LockupDetails.LockupAddress) when constructing the swap instead of "" (or move setting chainSwap.UserBtcLockupAddress to before NewChainSwap is invoked). Update the NewChainSwap call at the call site (and its parameter list usage) so the funding address is provided when CreateEvent is emitted.

swap/swap.go (1)
821-827: ⚠️ Potential issue | 🟠 Major

Don't detach the collaborative refund from the swap context.

This failure path switches to `context.Background()` before calling `RefundSwap`. If Boltz or arkd stalls here, `submarineSwap` can block indefinitely and never reach the unilateral fallback.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swap/swap.go` around lines 821 - 827, The RefundSwap call is using context.Background() which detaches the collaborative refund from the swap’s context and can block forever; change the call to pass the existing swap context (e.g., ctx or the swap-specific context used in submarineSwap) into h.RefundSwap so cancellation/timeout propagate and the unilateral fallback can proceed if Boltz/arkd stalls; update the call site of h.RefundSwap (the RefundSwap invocation within submarineSwap) to use the existing context and ensure that context carries any deadlines/cancellation already applied to the swap.

swap/chainswap.go (1)
47-77: 🛠️ Refactor suggestion | 🟠 Major

The exported mutable fields undermine the mutex.

`Status`, `Error`, the txids, and `UserBtcLockupAddress` are updated under `mu`, but callers can still read or write them directly without taking the lock. If `ChainSwap` is meant to be thread-safe, those fields need to be private behind getters and transition methods.

🤖 Prompt for AI Agents
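A minimal sketch of the suggested shape — unexported state plus locked accessors and a guarded state transition. The type and method names here are illustrative, not the actual `ChainSwap` API:

```go
package main

import (
	"fmt"
	"sync"
)

// chainSwap keeps its mutable state unexported so every read and
// write is forced through the mutex.
type chainSwap struct {
	mu     sync.RWMutex
	status string
}

// Status returns the current state under a read lock.
func (s *chainSwap) Status() string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.status
}

// transition moves to a new state only from an expected prior state,
// so racing writers cannot clobber each other's progress.
func (s *chainSwap) transition(from, to string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.status != from {
		return false
	}
	s.status = to
	return true
}

func main() {
	s := &chainSwap{status: "created"}
	fmt.Println(s.transition("created", "claimed"))  // true
	fmt.Println(s.transition("created", "refunded")) // false, already claimed
	fmt.Println(s.Status())                          // claimed
}
```

Compare-and-set transitions like this also make illegal state changes (e.g. refunding an already-claimed swap) fail loudly instead of silently overwriting.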
Verify each finding against the current code and only fix it if needed. In `@swap/chainswap.go` around lines 47 - 77, The exported mutable fields on ChainSwap (Status, Error, UserBtcLockupAddress, UserLockTxid, ServerLockTxid, ClaimTxid, RefundTxid) bypass the mutex; make these fields unexported (e.g. status, err, userBtcLockupAddress, userLockTxid, serverLockTxid, claimTxid, refundTxid) and provide thread-safe accessor and transition methods on ChainSwap (GetStatus, SetStatus/TransitionToX, GetError/SetError, GetUserBtcLockupAddress, GetUserLockTxid, etc.) that use mu (RLock/RUnlock for getters, Lock/Unlock for setters/transitions); update all callers/tests to use the new accessors and any swap-state transition logic to go through the ChainSwap methods so the mutex actually protects those fields.

swap/chainswap.go (1)
691-693: ⚠️ Potential issue | 🟠 Major

Use the provided context for collaborative refunds.

This method accepts `ctx` but detaches the actual `RefundSwap` call from it. A stuck Boltz/ARK request can't be canceled and delays the fallback path indefinitely.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swap/chainswap.go` around lines 691 - 693, The RefundSwap call is using context.Background(), detaching the operation from the caller's cancelation; change the h.RefundSwap invocation in chainswap.go to pass the provided ctx instead of context.Background() so the RefundSwap(SwapTypeChain, swapId, true, vhtlcOpts) call can be cancelled when the outer context is cancelled, preserving the existing arguments (swapId, SwapTypeChain, true, vhtlcOpts) and ensuring cooperative cancellation for the Boltz/ARK request.

swap/swap.go (1)
169-177: ⚠️ Potential issue | 🟠 Major

Don't silently drop extra VTXOs for the same script.

`getVHTLCFunds` can return multiple outputs, but these paths either take only `vtxos[0]` or sum all amounts while still passing just the first input into the batch session. That can strand funds or make the session amount no longer match its actual inputs.

Also applies to: 315-323, 1141-1164
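Option (A) from the suggestion — consume every returned output and keep the session amount equal to the sum of its inputs — can be sketched like this; the `vtxo` struct and `collectInputs` helper are simplified stand-ins, not the SDK's types:

```go
package main

import "fmt"

// vtxo is a simplified stand-in for the SDK's VTXO type.
type vtxo struct {
	Txid   string
	Amount uint64
}

// collectInputs turns every matching VTXO into a session input and
// returns the total, so inputs and the session amount cannot drift apart.
func collectInputs(vtxos []vtxo) (inputs []string, total uint64, err error) {
	if len(vtxos) == 0 {
		return nil, 0, fmt.Errorf("no vtxos found")
	}
	for _, v := range vtxos {
		inputs = append(inputs, v.Txid)
		total += v.Amount
	}
	return inputs, total, nil
}

func main() {
	ins, total, _ := collectInputs([]vtxo{{"a", 700}, {"b", 300}})
	fmt.Println(len(ins), total) // 2 1000
}
```

Because the loop builds the input list and the total in the same pass, a script that accumulated several VTXOs can no longer leave funds behind or produce an imbalanced batch intent.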
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swap/swap.go` around lines 169 - 177, getVHTLCFunds can return multiple vtxos but the code currently only uses vtxos[0] (e.g., the vtxo variable) while elsewhere sometimes sums all amounts, which can strand funds or make batch session amounts inconsistent; update the logic in functions referencing getVHTLCFunds (examples around the vtxos handling at the vHTLC lookup and the code paths at lines noted) to either (A) consume all returned vtxos by passing every output as inputs into the batch/session creation and use the summed amount consistently, or (B) select a single output deterministically and only use its amount when creating the session; ensure ErrorNoVtxosFound is still returned when len(vtxos)==0 and replace usages of vtxos[0] with a loop/collection of inputs (or a deterministic selector) so the session inputs and session amount always match the actual inputs.
🟡 Minor comments (3)
swap/boltz/status.go (1)
25-25: ⚠️ Potential issue | 🟡 Minor

Typo in constant name: `TransactionServerMempoool` has an extra 'o'. Should be `TransactionServerMempool` with two 'o's.

✏️ Proposed fix

```diff
-	TransactionServerMempoool
+	TransactionServerMempool
```

And update the map key reference on line 51:

```diff
-	"transaction.server.mempool": TransactionServerMempoool,
+	"transaction.server.mempool": TransactionServerMempool,
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swap/boltz/status.go` at line 25, Rename the misspelled constant TransactionServerMempoool to TransactionServerMempool (remove the extra 'o') and update any usages that reference the old name—specifically update the map key that currently uses TransactionServerMempoool to use TransactionServerMempool so all references (constant declaration and map lookup) match the corrected identifier.

swap/bolt12_test.go (1)
42-49: ⚠️ Potential issue | 🟡 Minor

Arguments to `require.Equal` are in reversed order. Same issue as above - the expected value should come before the actual value.

🐛 Proposed fix for correct argument order

```diff
 	decoded_invoice, err := swap.DecodeBolt12Invoice(invoice)
 	require.NoError(t, err)
-	require.Equal(t, decoded_invoice.AmountInSats, uint64(5000))
+	require.Equal(t, uint64(5000), decoded_invoice.AmountInSats)
 	require.Equal(
 		t,
-		hex.EncodeToString(decoded_invoice.PaymentHash160),
 		"c4f9f68a9a92d115970386c1e4bd7308941278fc",
+		hex.EncodeToString(decoded_invoice.PaymentHash160),
 	)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swap/bolt12_test.go` around lines 42 - 49, The require.Equal assertions have the expected and actual arguments reversed; update the two require.Equal calls so the expected value comes first and the actual value second (i.e., call require.Equal(t, uint64(5000), decoded_invoice.AmountInSats) and require.Equal(t, "c4f9f68a9a92d115970386c1e4bd7308941278fc", hex.EncodeToString(decoded_invoice.PaymentHash160))) to correct the order for decoded_invoice.AmountInSats and hex.EncodeToString(decoded_invoice.PaymentHash160).

swap/bolt12_test.go (1)
27-35: ⚠️ Potential issue | 🟡 Minor

Arguments to `require.Equal` are in reversed order.

The `testify` convention is `require.Equal(t, expected, actual)`, but the code passes actual first. This causes confusing failure messages when tests fail (showing "expected X but got Y" with X and Y swapped).

🐛 Proposed fix for correct argument order

```diff
 	decoded_offer, err := swap.DecodeBolt12Offer(offer)
 	require.NoError(t, err)
-	require.Equal(t, decoded_offer.AmountInSats, uint64(5000))
-	require.Equal(t, decoded_offer.DescriptionStr, "fun")
+	require.Equal(t, uint64(5000), decoded_offer.AmountInSats)
+	require.Equal(t, "fun", decoded_offer.DescriptionStr)
 	require.Equal(
 		t,
-		decoded_offer.ID,
 		"4435a4b44691f3e2164dae39814ba0ed13b5557ed1e11bc72f7f2e80e336d4c3",
+		decoded_offer.ID,
 	)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swap/bolt12_test.go` around lines 27 - 35, The require.Equal calls in the test using decoded_offer are passing actual before expected; change them to the testify convention require.Equal(t, expected, actual). Specifically update the assertions that reference decoded_offer.AmountInSats, decoded_offer.DescriptionStr, and decoded_offer.ID in the Test that calls swap.DecodeBolt12Offer(offer) so each call becomes require.Equal(t, <expected literal>, decoded_offer.<field>) (leave require.NoError as-is).
🧹 Nitpick comments (7)
test/infra/mockboltz/server.go (1)
1151-1195: HTTP client lacks timeout in `fetchArkSignerPubKey`.

The `http.Get` call uses the default client with no timeout. While the retry loop limits total attempts, each individual request could hang indefinitely.

♻️ Add HTTP client timeout

```diff
 func fetchArkSignerPubKey(arkdURL string) (*btcec.PublicKey, error) {
 	if strings.TrimSpace(arkdURL) == "" {
 		return nil, fmt.Errorf("MOCK_BOLTZ_ARKD_URL is required")
 	}
 	url := strings.TrimRight(strings.TrimSpace(arkdURL), "/") + "/v1/info"
+	client := &http.Client{Timeout: 5 * time.Second}
 	var lastErr error
 	for i := 0; i < 30; i++ {
-		resp, err := http.Get(url)
+		resp, err := client.Get(url)
 		if err != nil {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/infra/mockboltz/server.go` around lines 1151 - 1195, fetchArkSignerPubKey currently calls http.Get (default client) which can hang; replace the http.Get usage with an explicit http.Client that has a reasonable Timeout (e.g., a few seconds) and use client.Get(url) inside the retry loop, preserving existing retry logic and error handling (keep lastErr, url, payload parsing, and btcec.ParsePubKey behavior) so each attempt times out instead of potentially blocking indefinitely.

test/infra/setup_infra.sh (1)
42-53: Unused variable `out` in `run_quiet` function.

The `out` variable is assigned on success but never read anywhere in the script. Shellcheck correctly flags this (SC2034).

♻️ Remove unused variable or use it

If stdout capture isn't needed:

```diff
 run_quiet() {
 	local output
 	output=$("$@" 2>&1)
 	status=$?
 	if [ $status -eq 0 ]; then
-		out="$output"
 		return 0
 	else
 		err="$output"
 		return $status
 	fi
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/infra/setup_infra.sh` around lines 42 - 53, The run_quiet function assigns the captured stdout to a variable named out that is never used; remove the unused variable to satisfy Shellcheck SC2034 or, if the caller needs the output, expose it (e.g., set a named global or echo the output) instead of assigning to out. Specifically, in run_quiet keep capturing output into local output and status, then on success either echo "$output" (or assign it to a caller-visible variable) and return 0, or simply drop the assignment to out and only set err on failure so err and status remain meaningful. Ensure you update uses of run_quiet accordingly if you choose to return/echo the stdout.

swap/boltz/status.go (1)
58-60: `ParseEvent` returns zero value for unknown events, which equals `SwapCreated`.

When an unrecognized event string is passed, the map lookup returns the zero value `0`, which is indistinguishable from `SwapCreated`. Consider returning a sentinel value or using the comma-ok idiom.

♻️ Proposed fix with explicit unknown handling

```diff
+const (
+	SwapUnknown SwapUpdateEvent = iota - 1
+	SwapCreated
+	// ... rest unchanged
+)
+
 func ParseEvent(event string) SwapUpdateEvent {
-	return swapUpdateEventStrings[event]
+	if e, ok := swapUpdateEventStrings[event]; ok {
+		return e
+	}
+	return SwapUnknown
 }
```

Alternatively, add a second return value:

```go
func ParseEvent(event string) (SwapUpdateEvent, bool) {
	e, ok := swapUpdateEventStrings[event]
	return e, ok
}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swap/boltz/status.go` around lines 58 - 60, ParseEvent currently returns swapUpdateEventStrings[event] which yields the zero value on unknown keys (same as SwapCreated), causing ambiguity; update handling in ParseEvent by using the comma-ok idiom to check the map (e, ok := swapUpdateEventStrings[event]) and return a second boolean (func ParseEvent(event string) (SwapUpdateEvent, bool)) or alternatively define an explicit sentinel like SwapUpdateEventUnknown and return that when ok is false; ensure you update callers to handle the new return or sentinel.

test/infra/fulmine.Dockerfile (1)
17-28: Consider running the container as a non-root user.

The container runs as root by default, which is flagged by security scanners. While this is a test infrastructure container, adding a non-root user is a good security practice.

🔒 Proposed fix to add non-root user

```diff
 # Final image
 FROM alpine:3.20

+RUN addgroup -S fulmine && adduser -S fulmine -G fulmine
+
 WORKDIR /app
 COPY --from=go-builder /app/bin/* /app/
 ENV PATH="/app:${PATH}"
 ENV FULMINE_DATADIR=/app/data
+RUN chown -R fulmine:fulmine /app
 VOLUME /app/data

+USER fulmine
+
 ENTRYPOINT [ "fulmine" ]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/infra/fulmine.Dockerfile` around lines 17 - 28, Add a non-root user and switch to it before ENTRYPOINT: create a user/group (e.g., fulmine:fulmine), create and chown the data directory and /app to that user (referencing WORKDIR /app and ENV FULMINE_DATADIR), and set USER to that non-root user prior to ENTRYPOINT ["fulmine"]; ensure permissions allow the process to write to $FULMINE_DATADIR and that binaries in /app are executable by the new user.

swap/bolt12_test.go (1)
11-16: Consider using idiomatic Go naming conventions.

Go convention prefers camelCase for variable names. `invalid_offer` and `invalid_invoice` would be more idiomatic as `invalidOffer` and `invalidInvoice`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swap/bolt12_test.go` around lines 11 - 16, Rename the snake_case test variables to idiomatic camelCase: change invalid_offer to invalidOffer and invalid_invoice to invalidInvoice, then update every reference to these symbols (invalid_offer, invalid_invoice) throughout the test file (e.g., in assertions and helper calls) so the code compiles and follows Go naming conventions; leave other variables (offer, invoice) as-is unless you want to similarly CamelCase them.

swap/boltz/utils.go (1)
13-30: Context cancellation not detected during sleep interval.

The current implementation doesn't respond to context cancellation while waiting on `time.After(interval)`. If the context is cancelled mid-sleep, the function won't exit until the full interval elapses and the next iteration begins.

♻️ Proposed fix to make sleep context-aware

```diff
 	for {
 		select {
 		case <-ctx.Done():
 			if errors.Is(ctx.Err(), context.DeadlineExceeded) {
 				return fmt.Errorf("timed out")
 			}
 			return ctx.Err()
 		default:
 			done, err := fn(ctx)
 			if err != nil {
 				return err
 			}
 			if done {
 				return nil
 			}
-			<-time.After(interval)
+			select {
+			case <-ctx.Done():
+				if errors.Is(ctx.Err(), context.DeadlineExceeded) {
+					return fmt.Errorf("timed out")
+				}
+				return ctx.Err()
+			case <-time.After(interval):
+			}
 		}
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swap/boltz/utils.go` around lines 13 - 30, The loop that repeatedly calls fn(ctx) doesn't observe context cancellation during the sleep because it uses <-time.After(interval); update the wait so it is context-aware by replacing the plain sleep with a select that waits on either ctx.Done() or time.After(interval) and returns ctx.Err() (or wraps a deadline message) if context is canceled; locate the polling loop where fn(ctx) is invoked and the current <-time.After(interval) is used and change that single wait to a select on ctx.Done() and time.After(interval) so cancellation is handled immediately.

test/infra/docker-compose.yml (1)
268-270: Pin the Boltz image to a release tag or digest.

Using `boltz/boltz:latest` makes the happy-path stack non-reproducible, so upstream image changes can break CI/regtest without any repo change. Please pin this to a known-good tag or digest.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/infra/docker-compose.yml` around lines 268 - 270, The docker-compose service "boltz" is using the floating image tag "boltz/boltz:latest" which makes tests non-reproducible; update the "boltz" service in docker-compose.yml to pin the image to a known-good release tag or an immutable digest (e.g., replace "boltz/boltz:latest" with a specific tag like "boltz/boltz:vX.Y.Z" or "boltz/boltz@sha256:..."), and include a brief comment noting the chosen known-good version so CI/regtest remains reproducible and easy to update in future.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 877aa83a-8ebe-4292-ac35-56457d3b432e
⛔ Files ignored due to path filters (1)
`go.sum` is excluded by `!**/*.sum`
📒 Files selected for processing (46)
`.github/workflows/integration.yaml`, `Makefile`, `go.mod`, `swap/batch_handler.go`, `swap/bolt12.go`, `swap/bolt12_test.go`, `swap/boltz/boltz.go`, `swap/boltz/status.go`, `swap/boltz/types.go`, `swap/boltz/utils.go`, `swap/boltz/ws.go`, `swap/btc_util.go`, `swap/chainswap.go`, `swap/chainswap_arkt_btc_handler.go`, `swap/chainswap_btc_ark_handler.go`, `swap/chainswap_monitor.go`, `swap/chainswap_resume.go`, `swap/chainswap_validate.go`, `swap/explorer.go`, `swap/musig.go`, `swap/musig2_handler.go`, `swap/swap.go`, `swap/utils.go`, `test/docker/docker-compose.yml`, `test/docker/setup.go`, `test/e2e/asset_test.go`, `test/e2e/batch_session_test.go`, `test/e2e/exit_test.go`, `test/e2e/history_test.go`, `test/e2e/swap_test.go`, `test/e2e/transaction_test.go`, `test/e2e/utils_test.go`, `test/infra/arkd.Dockerfile`, `test/infra/balances.sh`, `test/infra/cln.Dockerfile`, `test/infra/docker-compose.yml`, `test/infra/fulmine.Dockerfile`, `test/infra/mockboltz/Dockerfile`, `test/infra/mockboltz/server.go`, `test/infra/setup_infra.sh`, `test/infra/wallet.Dockerfile`, `vhtlc/opts.go`, `vhtlc/utils.go`, `vhtlc/vhtlc.go`, `vhtlc/vhtlc_fixtures.json`, `vhtlc/vhtlc_test.go`
💤 Files with no reviewable changes (2)
- test/docker/docker-compose.yml
- test/docker/setup.go
🔍 Arkana PR Review —
Review (iterative — 2 new commits since last review) Cleanup/polish round. Changes:
Dockerfiles — Go 1.26.0 → 1.26.1 across
Formatting — Line wrapping in
No concerns with this round.
Actionable comments posted: 8
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@swap/boltz/ws.go`:
- Around line 43-49: The subscriptions channel is currently unbuffered (field
subscriptions chan bool) causing the reader goroutine to block if an ack arrives
when no goroutine is waiting; change subscriptions to a buffered channel (e.g.,
make(chan bool, 1)) and update all sends in the reader (the code path that
handles the "subscribe" ack) to perform non-blocking sends using a select with a
default case so an unexpected/delayed ack is dropped rather than blocking; apply
the same non-blocking send pattern wherever subscriptions is written (including
the subscribe()/ack handling and any reconnect code paths referenced around the
reader and reconnect logic) so reader/reconnect goroutines cannot deadlock on a
stalled receiver.
In `@swap/btc_util.go`:
- Around line 177-191: The control block builder must preserve the original leaf
versions from the swap tree instead of always writing BaseLeafVersion: update
createControlBlockFromSwapTree to use tree.ClaimLeaf.Version and
tree.RefundLeaf.Version when constructing the tap leaves (instead of calling
NewBaseTapLeaf/unconditionally inserting BaseLeafVersion), so the version bytes
written into the control block match the versions used by
computeSwapTreeMerkleRoot (which hashes tree.ClaimLeaf.Version and
tree.RefundLeaf.Version via tapLeafHash); ensure any helper used to build the
leaf or control block accepts/propagates the explicit Version value so the
control block and merkle root remain consistent.
In `@swap/chainswap_arkt_btc_handler.go`:
- Around line 596-604: The findLockupOutput function currently casts
wire.TxOut.Value to uint64 without validating it; add a guard in
findLockupOutput that checks each TxOut.Value from lockupTx.TxOut and rejects
non-positive values (Value <= 0) by returning an error (include txid/vout
context) before creating lockupTxOutput so you never cast negative or zero
values to uint64; update any callers that rely on the error to handle the
failure path if necessary.
In `@swap/chainswap_btc_ark_handler.go`:
- Around line 285-290: Check that claimDetails.TransactionHash decodes to
exactly 32 bytes before copying into msg: after hex.DecodeString into
txHashBytes, validate len(txHashBytes) == 32 and return a clear error (including
the invalid length and the original TransactionHash) if not; keep the fixed-size
buffer declaration var msg [32]byte and only perform copy(msg[:], txHashBytes)
once the length check passes. This change affects the transaction hash handling
around txHashBytes, claimDetails.TransactionHash and the msg [32]byte buffer
used for signing (from GetChainSwapClaimDetails flow).
In `@swap/swap.go`:
- Around line 169-177: ClaimVHTLC and RefundSwap currently only use the first
vtxo (vtxos[0]) returned by getVHTLCFunds while getBatchSessionArgs sums all
matching VTXOs into totalAmount, causing leftover funds and batch IO imbalance;
modify ClaimVHTLC, RefundSwap, and getBatchSessionArgs to iterate over the full
vtxos slice (not just vtxos[0]), aggregate their amounts into totalAmount, and
build transaction inputs/intents for every vtxo returned by getVHTLCFunds so all
matching outputs are consumed (update any use of the single variable vtxo to use
the full vtxos collection and ensure downstream functions that expect a single
input accept a list or are adjusted to process multiple inputs).
- Around line 774-777: The Boltz websocket created in submarineSwap via
h.boltzSvc.NewWebsocket() and ws.ConnectAndSubscribe(...) is never closed;
ensure you call ws.Close() so the reader/ping goroutines and remote subscription
are torn down: after a successful ConnectAndSubscribe, immediately defer
ws.Close() inside submarineSwap so it runs on every return, and if
ConnectAndSubscribe returns an error call ws.Close() before returning the error
(or ensure Close is invoked in the error path); reference NewWebsocket(),
ws.ConnectAndSubscribe(...), ws.Close(), and the submarineSwap function when
making the change.
- Around line 69-88: NewSwapHandler currently dereferences potentially nil
inputs and results (calling arkClient.GetConfigData, using *cfg and
privateKey.PubKey) which can panic; update NewSwapHandler to validate inputs
before dereferencing: return an error if arkClient or privateKey is nil, call
arkClient.GetConfigData only after checking arkClient and handle a nil/invalid
cfg result instead of dereferencing *cfg (or copy cfg safely), and call
privateKey.PubKey() only after confirming privateKey is non-nil; ensure all
failures return descriptive errors from NewSwapHandler (e.g., "nil arkClient",
"failed to get config data: %w", "nil privateKey") rather than allowing a panic.
In `@swap/utils.go`:
- Around line 244-256: In decodeInvoice, detect and reject invoices where
bolt11.MSatoshi is not an exact multiple of 1000 (i.e. bolt11.MSatoshi % 1000 !=
0) so non-whole-satoshi invoices fail instead of being floored; update
decodeInvoice to return an error when that condition is met (preserving the
existing return types) before computing amount := uint64(bolt11.MSatoshi / 1000)
and the preimage hash (this prevents reverseSwap from receiving an incorrect sat
amount).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: da599945-3720-4dc8-957f-4b8dbdb6468d
📒 Files selected for processing (10)
`.github/workflows/integration.yaml`, `.github/workflows/unit.yaml`, `swap/bolt12.go`, `swap/boltz/ws.go`, `swap/btc_util.go`, `swap/chainswap.go`, `swap/chainswap_arkt_btc_handler.go`, `swap/chainswap_btc_ark_handler.go`, `swap/swap.go`, `swap/utils.go`
✅ Files skipped from review due to trivial changes (2)
- .github/workflows/unit.yaml
- .github/workflows/integration.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
- swap/bolt12.go
- swap/chainswap.go
```go
	subscriptions chan bool
	mu            sync.RWMutex
	conn          *websocket.Conn
	closed        bool
	reconnect     bool
	dialer        *websocket.Dialer
	swapIds       []string
```
Make subscription acknowledgements non-blocking.
subscriptions is an unbuffered shared channel. If the "subscribe" ack arrives after the 5-second timeout—or when no subscribe() call is currently waiting—the send in the reader goroutine blocks forever, and no more updates or reconnect handling can run.
Suggested fix
```diff
 type Websocket struct {
 	Updates chan SwapUpdate
 	apiUrl  string
-	subscriptions chan bool
+	subscriptions chan struct{}
 	mu   sync.RWMutex
 	conn *websocket.Conn
@@
 	return &Websocket{
 		apiUrl:        boltz.WSURL,
-		subscriptions: make(chan bool),
+		subscriptions: make(chan struct{}, 1),
 		dialer:        &dialer,
 		Updates:       make(chan SwapUpdate),
 	}
 }
@@
 		case "subscribe":
-			boltz.subscriptions <- true
+			select {
+			case boltz.subscriptions <- struct{}{}:
+			default:
+			}
 			continue
```

Also applies to: 166-168, 226-230
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@swap/boltz/ws.go` around lines 43 - 49, The subscriptions channel is
currently unbuffered (field subscriptions chan bool) causing the reader
goroutine to block if an ack arrives when no goroutine is waiting; change
subscriptions to a buffered channel (e.g., make(chan bool, 1)) and update all
sends in the reader (the code path that handles the "subscribe" ack) to perform
non-blocking sends using a select with a default case so an unexpected/delayed
ack is dropped rather than blocking; apply the same non-blocking send pattern
wherever subscriptions is written (including the subscribe()/ack handling and
any reconnect code paths referenced around the reader and reconnect logic) so
reader/reconnect goroutines cannot deadlock on a stalled receiver.
```go
func computeSwapTreeMerkleRoot(tree boltz.SwapTree) ([]byte, error) {
	claimScript, err := hex.DecodeString(tree.ClaimLeaf.Output)
	if err != nil {
		return nil, fmt.Errorf("decode claim leaf script: %w", err)
	}
	refundScript, err := hex.DecodeString(tree.RefundLeaf.Output)
	if err != nil {
		return nil, fmt.Errorf("decode refund leaf script: %w", err)
	}

	claimLeafHash := tapLeafHash(tree.ClaimLeaf.Version, claimScript)
	refundLeafHash := tapLeafHash(tree.RefundLeaf.Version, refundScript)

	h := computeMerkleRoot(claimLeafHash[:], refundLeafHash[:])
	return h[:], nil
```
Preserve the swap-tree leaf versions when building the control block.
computeSwapTreeMerkleRoot hashes swapTree.ClaimLeaf.Version and swapTree.RefundLeaf.Version, but createControlBlockFromSwapTree recreates both leaves with NewBaseTapLeaf and always writes BaseLeafVersion into the control block. If Boltz ever returns a non-base leaf version, the control block no longer matches the merkle root you tweaked against, and the script-path claim will fail.
Also applies to: 230-253
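The reason the versions must match: the leaf version byte participates in the BIP-341 tagged leaf hash, so a control block that records BaseLeafVersion (0xc0) only verifies against a merkle root built from 0xc0 leaves. A stdlib-only sketch of the leaf hash (compact-size encoding simplified here to scripts shorter than 253 bytes):

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// taggedHash implements BIP-340 tagged hashing:
// sha256(sha256(tag) || sha256(tag) || msg).
func taggedHash(tag string, msg []byte) [32]byte {
	t := sha256.Sum256([]byte(tag))
	h := sha256.New()
	h.Write(t[:])
	h.Write(t[:])
	h.Write(msg)
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

// tapLeafHash hashes version || compactSize(len(script)) || script with the
// "TapLeaf" tag; scripts are assumed shorter than 253 bytes.
func tapLeafHash(version byte, script []byte) [32]byte {
	msg := append([]byte{version, byte(len(script))}, script...)
	return taggedHash("TapLeaf", msg)
}

func main() {
	script := []byte{0x51} // OP_TRUE, purely for illustration
	base := tapLeafHash(0xc0, script)
	other := tapLeafHash(0xc2, script)
	// A different leaf version yields a different leaf hash, hence a
	// different merkle root — the control block must record the real version.
	fmt.Println(base == other) // false
}
```

This is why the fix is to thread `tree.ClaimLeaf.Version` and `tree.RefundLeaf.Version` through to the control block rather than hard-coding the base version.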
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@swap/btc_util.go` around lines 177 - 191, The control block builder must
preserve the original leaf versions from the swap tree instead of always writing
BaseLeafVersion: update createControlBlockFromSwapTree to use
tree.ClaimLeaf.Version and tree.RefundLeaf.Version when constructing the tap
leaves (instead of calling NewBaseTapLeaf/unconditionally inserting
BaseLeafVersion), so the version bytes written into the control block match the
versions used by computeSwapTreeMerkleRoot (which hashes tree.ClaimLeaf.Version
and tree.RefundLeaf.Version via tapLeafHash); ensure any helper used to build
the leaf or control block accepts/propagates the explicit Version value so the
control block and merkle root remain consistent.
```go
func findLockupOutput(lockupTx *wire.MsgTx, expectedPkScript []byte) (*lockupTxOutput, error) {
	for vout, out := range lockupTx.TxOut {
		if bytes.Equal(out.PkScript, expectedPkScript) {
			return &lockupTxOutput{
				txid:     lockupTx.TxHash().String(),
				vout:     uint32(vout),
				amount:   uint64(out.Value),
				pkScript: out.PkScript,
			}, nil
```
Reject non-positive lockup outputs before casting to uint64.
wire.TxOut.Value is signed. If a malformed lockup transaction contains Value <= 0, the cast to uint64 wraps negatives into a huge amount and the claim builder works with nonsense funds instead of failing cleanly.
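The guard has to run before the conversion, since `wire.TxOut.Value` is an `int64` and the cast itself cannot fail. A sketch of the check (the helper's shape is illustrative, not the handler's actual code):

```go
package main

import "fmt"

// txOutAmount converts a signed output value to uint64, rejecting
// non-positive values instead of letting the cast wrap them into
// a huge unsigned amount.
func txOutAmount(txid string, vout int, value int64) (uint64, error) {
	if value <= 0 {
		return 0, fmt.Errorf("invalid lockup output value %d at %s:%d", value, txid, vout)
	}
	return uint64(value), nil
}

func main() {
	if _, err := txOutAmount("deadbeef", 0, -1); err != nil {
		fmt.Println(err) // the -1 is rejected instead of becoming 2^64-1
	}
	amt, _ := txOutAmount("deadbeef", 1, 5000)
	fmt.Println(amt) // 5000
}
```

Including the txid and vout in the error, as the prompt suggests, makes a malformed lockup transaction easy to trace from logs.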
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@swap/chainswap_arkt_btc_handler.go` around lines 596 - 604, The
findLockupOutput function currently casts wire.TxOut.Value to uint64 without
validating it; add a guard in findLockupOutput that checks each TxOut.Value from
lockupTx.TxOut and rejects non-positive values (Value <= 0) by returning an
error (include txid/vout context) before creating lockupTxOutput so you never
cast negative or zero values to uint64; update any callers that rely on the
error to handle the failure path if necessary.
```go
	txHashBytes, err := hex.DecodeString(claimDetails.TransactionHash)
	if err != nil {
		return fmt.Errorf("decode transaction hash: %w", err)
	}
	var msg [32]byte
	copy(msg[:], txHashBytes)
```
Validate TransactionHash length before copying it into the signing buffer.
copy(msg[:], txHashBytes) silently zero-pads short hashes and truncates long ones. A malformed GetChainSwapClaimDetails response would still produce a partial signature, just over the wrong message.
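A sketch of the length check, assuming the hash arrives hex-encoded as in the claim-details response (the helper name is illustrative):

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// decodeSigningHash decodes a hex transaction hash and insists on exactly
// 32 bytes, so a malformed response fails instead of being silently
// zero-padded or truncated by copy.
func decodeSigningHash(txHashHex string) ([32]byte, error) {
	var msg [32]byte
	b, err := hex.DecodeString(txHashHex)
	if err != nil {
		return msg, fmt.Errorf("decode transaction hash: %w", err)
	}
	if len(b) != 32 {
		return msg, fmt.Errorf("transaction hash %q: got %d bytes, want 32", txHashHex, len(b))
	}
	copy(msg[:], b)
	return msg, nil
}

func main() {
	_, err := decodeSigningHash("deadbeef") // 4 bytes, not 32
	fmt.Println(err != nil) // true
}
```

With the check in place, a partial MuSig2 signature can never be produced over a padded or truncated message.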
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@swap/chainswap_btc_ark_handler.go` around lines 285 - 290, Check that
claimDetails.TransactionHash decodes to exactly 32 bytes before copying into
msg: after hex.DecodeString into txHashBytes, validate len(txHashBytes) == 32
and return a clear error (including the invalid length and the original
TransactionHash) if not; keep the fixed-size buffer declaration var msg [32]byte
and only perform copy(msg[:], txHashBytes) once the length check passes. This
change affects the transaction hash handling around txHashBytes,
claimDetails.TransactionHash and the msg [32]byte buffer used for signing (from
GetChainSwapClaimDetails flow).
```go
func NewSwapHandler(
	arkClient arksdk.ArkClient,
	boltzSvc *boltz.Api,
	esploraURL string,
	privateKey *btcec.PrivateKey,
	timeout uint32,
) (*SwapHandler, error) {
	cfg, err := arkClient.GetConfigData(context.Background())
	if err != nil {
		return nil, fmt.Errorf("failed to get config data: %w", err)
	}
	return &SwapHandler{
		arkClient:      arkClient,
		boltzSvc:       boltzSvc,
		explorerClient: NewExplorerClient(esploraURL),
		privateKey:     privateKey,
		publicKey:      privateKey.PubKey(),
		timeout:        timeout,
		config:         *cfg,
	}, nil
```
Validate constructor dependencies before dereferencing them.
arkClient.GetConfigData(...), privateKey.PubKey(), and *cfg can all panic here. Since NewSwapHandler already returns an error, bad inputs should fail fast with a regular error instead of crashing the process.
Suggested fix
func NewSwapHandler(
arkClient arksdk.ArkClient,
boltzSvc *boltz.Api,
esploraURL string,
privateKey *btcec.PrivateKey,
timeout uint32,
) (*SwapHandler, error) {
+ if arkClient == nil {
+ return nil, errors.New("nil ark client")
+ }
+ if boltzSvc == nil {
+ return nil, errors.New("nil boltz service")
+ }
+ if privateKey == nil {
+ return nil, errors.New("nil private key")
+ }
+
cfg, err := arkClient.GetConfigData(context.Background())
if err != nil {
return nil, fmt.Errorf("failed to get config data: %w", err)
}
+ if cfg == nil {
+ return nil, errors.New("nil config data")
+ }
return &SwapHandler{
arkClient: arkClient,
  boltzSvc: boltzSvc,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@swap/swap.go` around lines 69 - 88, NewSwapHandler currently dereferences
potentially nil inputs and results (calling arkClient.GetConfigData, using *cfg
and privateKey.PubKey) which can panic; update NewSwapHandler to validate inputs
before dereferencing: return an error if arkClient or privateKey is nil, call
arkClient.GetConfigData only after checking arkClient and handle a nil/invalid
cfg result instead of dereferencing *cfg (or copy cfg safely), and call
privateKey.PubKey() only after confirming privateKey is non-nil; ensure all
failures return descriptive errors from NewSwapHandler (e.g., "nil arkClient",
"failed to get config data: %w", "nil privateKey") rather than allowing a panic.
vtxos, err := h.getVHTLCFunds(ctx, []*vhtlc.VHTLCScript{vHTLC})
if err != nil {
	return "", err
}
if len(vtxos) == 0 {
	return "", ErrorNoVtxosFound
}

vtxo := &vtxos[0]
Multi-VTXO swaps are only partially handled.
ClaimVHTLC and RefundSwap pick vtxos[0], while getBatchSessionArgs sums every matching VTXO into totalAmount but still only forwards vtxos[0]. If a swap script accumulates more than one VTXO, the direct paths leave funds behind and the batch-session intents can become input/output imbalanced.
Also applies to: 315-323, 1158-1188
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@swap/swap.go` around lines 169 - 177, ClaimVHTLC and RefundSwap currently
only use the first vtxo (vtxos[0]) returned by getVHTLCFunds while
getBatchSessionArgs sums all matching VTXOs into totalAmount, causing leftover
funds and batch IO imbalance; modify ClaimVHTLC, RefundSwap, and
getBatchSessionArgs to iterate over the full vtxos slice (not just vtxos[0]),
aggregate their amounts into totalAmount, and build transaction inputs/intents
for every vtxo returned by getVHTLCFunds so all matching outputs are consumed
(update any use of the single variable vtxo to use the full vtxos collection and
ensure downstream functions that expect a single input accept a list or are
adjusted to process multiple inputs).
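The aggregation the prompt asks for can be sketched with stand-in types. `vtxo`, `txInput`, and `collectVHTLCInputs` are hypothetical names; the real code uses the SDK's VTXO and intent-input types:

```go
package main

import "fmt"

// vtxo and txInput are stand-in types for this sketch.
type vtxo struct {
	Txid   string
	Vout   uint32
	Amount uint64
}

type txInput struct {
	Txid string
	Vout uint32
}

// collectVHTLCInputs consumes every matching VTXO instead of only vtxos[0],
// so the summed totalAmount and the built inputs stay balanced.
func collectVHTLCInputs(vtxos []vtxo) ([]txInput, uint64) {
	inputs := make([]txInput, 0, len(vtxos))
	var totalAmount uint64
	for _, v := range vtxos {
		totalAmount += v.Amount
		inputs = append(inputs, txInput{Txid: v.Txid, Vout: v.Vout})
	}
	return inputs, totalAmount
}

func main() {
	inputs, total := collectVHTLCInputs([]vtxo{
		{Txid: "aa", Vout: 0, Amount: 1000},
		{Txid: "bb", Vout: 1, Amount: 2500},
	})
	fmt.Println(len(inputs), total) // 2 3500
}
```

Downstream claim/refund builders would then accept the full input list rather than a single VTXO.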
ws := h.boltzSvc.NewWebsocket()
if err := ws.ConnectAndSubscribe(ctx, []string{swap.Id}, 5*time.Second); err != nil {
	return nil, err
}
Always close the Boltz websocket opened for submarineSwap.
This function can return from several later success/failure branches, but the websocket created here is never closed. That leaves the reader/ping goroutines and the remote subscription alive after the swap finishes.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@swap/swap.go` around lines 774 - 777, The Boltz websocket created in
submarineSwap via h.boltzSvc.NewWebsocket() and ws.ConnectAndSubscribe(...) is
never closed; ensure you call ws.Close() so the reader/ping goroutines and
remote subscription are torn down: after a successful ConnectAndSubscribe,
immediately defer ws.Close() inside submarineSwap so it runs on every return,
and if ConnectAndSubscribe returns an error call ws.Close() before returning the
error (or ensure Close is invoked in the error path); reference NewWebsocket(),
ws.ConnectAndSubscribe(...), ws.Close(), and the submarineSwap function when
making the change.
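A minimal sketch of the close-on-every-path pattern, using a stand-in `fakeWS` type rather than the real Boltz websocket client:

```go
package main

import (
	"errors"
	"fmt"
)

// fakeWS stands in for the Boltz websocket client in this sketch.
type fakeWS struct{ closed bool }

func (w *fakeWS) ConnectAndSubscribe(ids []string) error { return nil }
func (w *fakeWS) Close()                                 { w.closed = true }

// submarineSwap closes the websocket on the connect error path and defers
// Close after a successful subscribe, so every later return tears it down.
func submarineSwap(ws *fakeWS, swapID string) error {
	if err := ws.ConnectAndSubscribe([]string{swapID}); err != nil {
		ws.Close() // error path: Close before returning
		return err
	}
	defer ws.Close() // success path: Close runs on every return below

	if swapID == "" {
		return errors.New("empty swap id") // Close still runs via defer
	}
	return nil
}

func main() {
	ws := &fakeWS{}
	_ = submarineSwap(ws, "swap-1")
	fmt.Println(ws.closed) // true
}
```

The `defer` placement is the key point: it must come immediately after `ConnectAndSubscribe` succeeds, so no later branch can leak the connection.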
func decodeInvoice(invoice string) (uint64, []byte, error) {
	bolt11, err := decodepay.Decodepay(invoice)
	if err != nil {
		return 0, nil, err
	}

	amount := uint64(bolt11.MSatoshi / 1000)
	preimageHash, err := hex.DecodeString(bolt11.PaymentHash)
	if err != nil {
		return 0, nil, err
	}

	return amount, input.Ripemd160H(preimageHash), nil
Reject invoices that are not a whole number of sats.
bolt11.MSatoshi / 1000 floors any leftover millisats. That means a non-sat invoice can be treated as the wrong sat amount in reverseSwap instead of failing cleanly.
Suggested fix
func decodeInvoice(invoice string) (uint64, []byte, error) {
bolt11, err := decodepay.Decodepay(invoice)
if err != nil {
return 0, nil, err
}
+ if bolt11.MSatoshi%1000 != 0 {
+ return 0, nil, fmt.Errorf("invoice amount %d msat is not a whole sat", bolt11.MSatoshi)
+ }
amount := uint64(bolt11.MSatoshi / 1000)
preimageHash, err := hex.DecodeString(bolt11.PaymentHash)
if err != nil {
return 0, nil, err
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
func decodeInvoice(invoice string) (uint64, []byte, error) {
	bolt11, err := decodepay.Decodepay(invoice)
	if err != nil {
		return 0, nil, err
	}
	if bolt11.MSatoshi%1000 != 0 {
		return 0, nil, fmt.Errorf("invoice amount %d msat is not a whole sat", bolt11.MSatoshi)
	}
	amount := uint64(bolt11.MSatoshi / 1000)
	preimageHash, err := hex.DecodeString(bolt11.PaymentHash)
	if err != nil {
		return 0, nil, err
	}
	return amount, input.Ripemd160H(preimageHash), nil
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@swap/utils.go` around lines 244 - 256, In decodeInvoice, detect and reject
invoices where bolt11.MSatoshi is not an exact multiple of 1000 (i.e.
bolt11.MSatoshi % 1000 != 0) so non-whole-satoshi invoices fail instead of being
floored; update decodeInvoice to return an error when that condition is met
(preserving the existing return types) before computing amount :=
uint64(bolt11.MSatoshi / 1000) and the preimage hash (this prevents reverseSwap
from receiving an incorrect sat amount).
Actionable comments posted: 9
♻️ Duplicate comments (2)
swap/chainswap_btc_ark_handler.go (1)
286-291: ⚠️ Potential issue | 🟠 Major
Reject non-32-byte claim hashes before signing.
copy(msg[:], txHashBytes) silently zero-pads or truncates malformed TransactionHash values, so this path can produce a partial signature over the wrong message.
Suggested fix
 txHashBytes, err := hex.DecodeString(claimDetails.TransactionHash)
 if err != nil {
 	return fmt.Errorf("decode transaction hash: %w", err)
 }
+if len(txHashBytes) != 32 {
+	return fmt.Errorf(
+		"invalid transaction hash length %d for %q",
+		len(txHashBytes),
+		claimDetails.TransactionHash,
+	)
+}
 var msg [32]byte
 copy(msg[:], txHashBytes)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swap/chainswap_btc_ark_handler.go` around lines 286 - 291, the code currently decodes claimDetails.TransactionHash into txHashBytes and blindly does copy(msg[:], txHashBytes), which will silently zero-pad or truncate non-32-byte inputs; instead validate the decoded length first and return an error if len(txHashBytes) != 32 before constructing msg and signing. Specifically, in the block that handles txHashBytes (use the symbols claimDetails.TransactionHash, txHashBytes, msg [32]byte and the copy call), add a length check and fail fast with a clear error when the decoded hash is not exactly 32 bytes so you never sign a partially padded or truncated message.
swap/boltz/ws.go (1)
42-43: ⚠️ Potential issue | 🟠 Major
Buffer subscription acks so the reader can't deadlock.
subscriptions is still unbuffered. If a "subscribe" ack arrives after the 5-second wait has already timed out—or when no subscribe() call is currently waiting—the reader goroutine blocks forever on the send, and no more updates or reconnect handling can run.
Suggested fix
 type Websocket struct {
 	Updates       chan SwapUpdate
 	apiUrl        string
-	subscriptions chan bool
+	subscriptions chan struct{}
 	mu            sync.RWMutex
 	conn          *websocket.Conn
 	closed        bool
@@
 	return &Websocket{
 		apiUrl:        boltz.WSURL,
-		subscriptions: make(chan bool),
+		subscriptions: make(chan struct{}, 1),
 		dialer:        &dialer,
 		Updates:       make(chan SwapUpdate),
 	}
 }
@@
 	case "subscribe":
-		boltz.subscriptions <- true
+		select {
+		case boltz.subscriptions <- struct{}{}:
+		default:
+		}
 		continue

Also applies to: 67-70, 166-168, 226-230
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swap/boltz/ws.go` around lines 42 - 43, The subscriptions channel is unbuffered which lets the reader goroutine block forever when it tries to send an ack if no subscribe() caller is waiting; change the channel to a buffered channel (e.g., capacity 1) where it is declared/initialized (the subscriptions variable and any make(chan bool) instances referenced in the file) so the reader can send an ack without blocking, and apply the same buffer change to the other similar channels found in the file (the other channel declarations/initializations mentioned around the other ranges) to prevent deadlocks in the reader goroutine.
🧹 Nitpick comments (1)
test/infra/fulmine.Dockerfile (1)
13-14: Consider additional build optimizations for smaller, more secure images. The build configuration is solid, but you could consider these optional enhancements:
- Add -trimpath to remove local path information from the binary
- Add -s -w to strip debug symbols and reduce binary size
- Add LABEL directives for image metadata (version, commit, build date)
- Add HEALTHCHECK for better container orchestration
♻️ Optional build and metadata improvements
 # Build without web assets (not needed for boltz-fulmine sidecar)
 RUN CGO_ENABLED=0 GOOS=${TARGETOS} GOARCH=${TARGETARCH} \
-    go build -ldflags="-X 'main.version=${VERSION}'" -o bin/fulmine cmd/fulmine/main.go
+    go build -trimpath -ldflags="-s -w -X 'main.version=${VERSION}'" -o bin/fulmine cmd/fulmine/main.go

 # Final image
 FROM alpine:3.20
+LABEL org.opencontainers.image.version="${VERSION}" \
+      org.opencontainers.image.source="https://github.com/ArkLabsHQ/fulmine" \
+      org.opencontainers.image.title="Boltz Fulmine Sidecar"
+
 WORKDIR /app
 COPY --from=go-builder /app/bin/* /app/
 ENV PATH="/app:${PATH}"
 ENV FULMINE_DATADIR=/app/data
 VOLUME /app/data
+HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
+    CMD fulmine healthcheck || exit 1
+
 ENTRYPOINT [ "fulmine" ]
Note: Verify that the fulmine binary supports a healthcheck subcommand before adding the HEALTHCHECK directive.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/infra/fulmine.Dockerfile` around lines 13 - 14, Update the Dockerfile RUN build step that compiles cmd/fulmine/main.go to add Go linker flags and build flags for smaller binaries (add -trimpath and -ldflags="-s -w -X 'main.version=${VERSION}'" or incorporate -s -w into the existing -ldflags) and keep CGO_ENABLED/GOOS/GOARCH usage as-is; also add image LABEL directives exposing version/commit/build date (use ${VERSION}, commit hash and build date build args) and, if the fulmine binary implements a healthcheck subcommand, add a HEALTHCHECK directive invoking the binary (fallback to a simple exit 0/1 check if not available) to improve metadata and runtime observability.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@swap/boltz/ws.go`:
- Around line 75-79: Connect() and the reader/reconnect loops reopen the
websocket even after Close() sets boltz.closed = true; before any dial or retry
sleep return if boltz.closed is set. Modify Websocket.Connect(), the reader
goroutine's reconnect/backoff code, and any retry loops to check boltz.closed
(under the same mutex/atomic used by Close()) before attempting url.Parse/dial
or sleeping for backoff, and bail out cleanly instead of retrying; use the
existing Close(), boltz.closed and the same synchronization primitive used
elsewhere to avoid races.
In `@test/e2e/swap_test.go`:
- Around line 48-53: The shared events slice is mutated by the async
swap-monitor goroutine via eventCallback (func(event swap.ChainSwapEvent)) and
later read by the test goroutine, causing a data race; protect access by
serializing modifies/reads (e.g., add a sync.Mutex or use a channel) around the
events slice: wrap all append operations in eventCallback and all test-side
reads/assertions with the same mutex.lock/unlock (or send events to a buffered
channel and collect from it in the test), and apply the same protection to the
other similar callback sites mentioned so all accesses to the events slice are
synchronized.
In `@test/e2e/utils_test.go`:
- Around line 243-259: The helper setMockBoltzConfig (and the other mock-boltz
helpers in this file) use http.DefaultClient/NewRequest with no context/timeout
and proceed to read/parse responses without first verifying StatusCode; update
setMockBoltzConfig to create a context with a short timeout (e.g.,
context.WithTimeout), build the request with http.NewRequestWithContext, use an
http.Client with a timeout or the context, perform the request, immediately
check resp.StatusCode and return/fail with the resp body included in the error
message if it's not 200 OK, then only read/parse the body on success and always
defer resp.Body.Close(); apply the same pattern to the other helpers referenced
(the GET/POST mock-boltz helpers in the file).
In `@test/infra/fulmine.Dockerfile`:
- Around line 4-7: Add the missing build args to the boltz-fulmine service so
VERSION, TARGETOS, TARGETARCH and FULMINE_BRANCH are passed into the Docker
build; specifically update the service's build block in docker-compose.yml to
provide args for VERSION (e.g. "test-infra" or a real version string), TARGETOS
(e.g. "linux"), TARGETARCH (e.g. "amd64") and FULMINE_BRANCH (pin to a
branch/tag instead of defaulting to "master") so the ARGs declared in the
fulmine.Dockerfile (VERSION, TARGETOS, TARGETARCH, FULMINE_BRANCH) are populated
during image build and the binary contains version info.
- Around line 17-28: The image currently runs as root; create a dedicated
non-root user and switch to it: add steps to create a user/group (e.g. fulmine
user), ensure ownership of the FULMINE_DATADIR (/app/data) and any copied
binaries is set to that user, and add a USER directive before the ENTRYPOINT so
the container runs unprivileged; update references around ENV FULMINE_DATADIR
and ENTRYPOINT "fulmine" to ensure the new user can execute the binary and write
to the data directory.
- Around line 7-10: The Dockerfile currently uses ARG FULMINE_BRANCH and runs
"git clone ... --branch ${FULMINE_BRANCH}" which leads to non-reproducible
builds; change the build arg to something like FULMINE_COMMIT and update the RUN
step that performs the clone (the RUN git clone ... line) so it fetches and
checks out a specific commit/tag (use a minimal fetch and git checkout of
${FULMINE_COMMIT} or otherwise fetch by commit hash) instead of cloning a moving
branch; also update any consumers (e.g., docker-compose build args) to pass the
pinned FULMINE_COMMIT value.
In `@test/infra/mockboltz/Dockerfile`:
- Around line 20-25: The final Docker stage runs as root; create and switch to a
non-root user and ensure the binary is owned and executable by that user before
switching. In the Dockerfile change the runtime stage that contains EXPOSE 9001
and ENTRYPOINT ["/app/mock-boltz"] to add a user (e.g., addgroup/adduser or
adduser -D), chown /app/mock-boltz and any required dirs to that user, set
appropriate file permissions, and add a USER <username> line before the
ENTRYPOINT so the container no longer runs as root.
In `@test/infra/mockboltz/server.go`:
- Around line 591-609: handleWS is modifying client.subs without locking while
pushSwapUpdate reads it, causing a concurrent map panic; wrap all writes to
client.subs with the client's mutex (client.mu) the same way reads are
protected. Specifically, in the subscribe branch where you iterate msg.Args and
set client.subs[id] = struct{}{} (and the similar block at the other location
mentioned), acquire client.mu.Lock() before mutating client.subs and defer or
unlock after the updates, keeping the existing conn.WriteJSON ack calls
protected as needed to avoid races between subscription changes and
pushSwapUpdate.
- Around line 276-280: The goroutine uses an undefined logger reference
s.log.Printf which breaks compilation; replace that call with the mock server's
test logger (s.t.Logf) so the error is logged via the existing testing.T field,
i.e. in the anonymous func that calls s.pushSwapUpdate change s.log.Printf(...)
to s.t.Logf(...); if the Server struct actually uses a different logger field
(e.g. s.logger), use that field's Printf method instead.
---
Duplicate comments:
In `@swap/boltz/ws.go`:
- Around line 42-43: The subscriptions channel is unbuffered which lets the
reader goroutine block forever when it tries to send an ack if no subscribe()
caller is waiting; change the channel to a buffered channel (e.g., capacity 1)
where it is declared/initialized (the subscriptions variable and any make(chan
bool) instances referenced in the file) so the reader can send an ack without
blocking, and apply the same buffer change to the other similar channels found
in the file (the other channel declarations/initializations mentioned around the
other ranges) to prevent deadlocks in the reader goroutine.
In `@swap/chainswap_btc_ark_handler.go`:
- Around line 286-291: The code currently decodes claimDetails.TransactionHash
into txHashBytes and blindly does copy(msg[:], txHashBytes) which will silently
zero-pad or truncate non-32-byte inputs; instead validate the decoded length
first and return an error if len(txHashBytes) != 32 before constructing msg and
signing. Specifically, in the block that handles txHashBytes (use the symbols
claimDetails.TransactionHash, txHashBytes, msg [32]byte and the copy call), add
a length check and fail fast with a clear error when the decoded hash is not
exactly 32 bytes so you never sign a partially padded or truncated message.
---
Nitpick comments:
In `@test/infra/fulmine.Dockerfile`:
- Around line 13-14: Update the Dockerfile RUN build step that compiles
cmd/fulmine/main.go to add Go linker flags and build flags for smaller binaries
(add -trimpath and -ldflags="-s -w -X 'main.version=${VERSION}'" or incorporate
-s -w into the existing -ldflags) and keep CGO_ENABLED/GOOS/GOARCH usage as-is;
also add image LABEL directives exposing version/commit/build date (use
${VERSION}, commit hash and build date build args) and, if the fulmine binary
implements a healthcheck subcommand, add a HEALTHCHECK directive invoking the
binary (fallback to a simple exit 0/1 check if not available) to improve
metadata and runtime observability.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b1a26417-a352-4c40-92ef-e191a130e007
📒 Files selected for processing (11)
- swap/boltz/ws.go
- swap/chainswap_arkt_btc_handler.go
- swap/chainswap_btc_ark_handler.go
- swap/swap.go
- test/e2e/swap_test.go
- test/e2e/utils_test.go
- test/infra/arkd.Dockerfile
- test/infra/fulmine.Dockerfile
- test/infra/mockboltz/Dockerfile
- test/infra/mockboltz/server.go
- test/infra/wallet.Dockerfile
✅ Files skipped from review due to trivial changes (2)
- test/infra/arkd.Dockerfile
- test/infra/wallet.Dockerfile
🚧 Files skipped from review as they are similar to previous changes (2)
- swap/chainswap_arkt_btc_handler.go
- swap/swap.go
func (boltz *Websocket) Connect() error {
	wsUrl, err := url.Parse(boltz.apiUrl)
	if err != nil {
		return err
	}
Stop reconnecting once the websocket has been closed.
Close() flips closed = true, but neither Connect() nor the reconnect loop checks that flag before dialing again. If Close() happens while the reader goroutine is already in the backoff path, this code will reopen a client the caller explicitly shut down.
Suggested fix
func (boltz *Websocket) Connect() error {
+ boltz.mu.RLock()
+ closed := boltz.closed
+ boltz.mu.RUnlock()
+ if closed {
+ return errors.New("websocket is closed")
+ }
+
wsUrl, err := url.Parse(boltz.apiUrl)
if err != nil {
return err
}
@@
for {
pingTicker.Stop()
boltz.mu.Lock()
+ closed := boltz.closed
reconnect := boltz.reconnect
if reconnect {
boltz.reconnect = false
}
boltz.mu.Unlock()
- if reconnect {
+ if closed || reconnect {
return
} else {
time.Sleep(reconnectInterval)
}
	err := boltz.Connect()

Also applies to: 173-189, 258-281
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@swap/boltz/ws.go` around lines 75 - 79, Connect() and the reader/reconnect
loops reopen the websocket even after Close() sets boltz.closed = true; before
any dial or retry sleep return if boltz.closed is set. Modify
Websocket.Connect(), the reader goroutine's reconnect/backoff code, and any
retry loops to check boltz.closed (under the same mutex/atomic used by Close())
before attempting url.Parse/dial or sleeping for backoff, and bail out cleanly
instead of retrying; use the existing Close(), boltz.closed and the same
synchronization primitive used elsewhere to avoid races.
// Track swap events to verify the correct state machine transitions
var events []swap.ChainSwapEvent
eventCallback := func(event swap.ChainSwapEvent) {
	events = append(events, event)
	t.Logf("ChainSwap event: %T", event)
}
Protect the shared events slices.
Each of these callbacks appends to a slice from an async swap-monitor goroutine, while the test goroutine later reads the same slice for assertions. That is a real data race.
Suggested fix
- var events []swap.ChainSwapEvent
+ var (
+ eventsMu sync.Mutex
+ events []swap.ChainSwapEvent
+ )
eventCallback := func(event swap.ChainSwapEvent) {
+ eventsMu.Lock()
events = append(events, event)
+ eventsMu.Unlock()
t.Logf("ChainSwap event: %T", event)
}
@@
- for _, e := range events {
+ eventsMu.Lock()
+ snapshot := append([]swap.ChainSwapEvent(nil), events...)
+ eventsMu.Unlock()
+ for _, e := range snapshot {
		switch e.(type) {

Also applies to: 148-152, 368-372, 457-460, 558-561, 657-660, 761-764, 860-863
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/e2e/swap_test.go` around lines 48 - 53, The shared events slice is
mutated by the async swap-monitor goroutine via eventCallback (func(event
swap.ChainSwapEvent)) and later read by the test goroutine, causing a data race;
protect access by serializing modifies/reads (e.g., add a sync.Mutex or use a
channel) around the events slice: wrap all append operations in eventCallback
and all test-side reads/assertions with the same mutex.lock/unlock (or send
events to a buffered channel and collect from it in the test), and apply the
same protection to the other similar callback sites mentioned so all accesses to
the events slice are synchronized.
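The channel-based alternative mentioned in the prompt can be sketched in isolation (`chainSwapEvent` is a stand-in for `swap.ChainSwapEvent`, and `runMonitor` simulates the async swap monitor):

```go
package main

import "fmt"

// chainSwapEvent stands in for swap.ChainSwapEvent in this sketch.
type chainSwapEvent string

// runMonitor simulates the async swap monitor invoking the event callback.
func runMonitor(emit func(chainSwapEvent)) {
	emit("swap.created")
	emit("transaction.server.mempool")
}

// collectEvents wires the callback to a buffered channel, so the monitor
// goroutine only sends and never mutates a shared slice; the caller drains
// the channel after the monitor signals completion. No mutex is needed.
func collectEvents() []chainSwapEvent {
	eventCh := make(chan chainSwapEvent, 64) // buffered: callback never blocks
	done := make(chan struct{})
	go func() {
		defer close(done)
		runMonitor(func(e chainSwapEvent) { eventCh <- e })
	}()
	<-done
	close(eventCh)

	var events []chainSwapEvent
	for e := range eventCh {
		events = append(events, e)
	}
	return events
}

func main() {
	fmt.Println(len(collectEvents())) // 2
}
```

Compared to the mutex fix, this avoids a snapshot copy at assertion time because the slice is only ever touched by one goroutine.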
func setMockBoltzConfig(t *testing.T, cfg map[string]any) {
	t.Helper()
	body, err := json.Marshal(cfg)
	require.NoError(t, err)

	req, err := http.NewRequest("POST", mockBoltzAdminURL+"/admin/config", bytes.NewReader(body))
	require.NoError(t, err)
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	require.NoError(t, err)
	defer func() { _ = resp.Body.Close() }()

	respBody, _ := io.ReadAll(resp.Body)
	require.Equalf(t, http.StatusOK, resp.StatusCode,
		"setMockBoltzConfig failed: %s", string(respBody))
}
Bound and validate the mock-boltz admin requests.
These helpers use http.DefaultClient/http.Get with no timeout or request context, and the read helpers decode bodies without checking for 200 OK. A stalled or erroring mock service can hang the suite or return zero-value state that looks valid to the caller.
Suggested fix
const mockBoltzAdminURL = "http://127.0.0.1:9101"
+
+var mockBoltzHTTPClient = &http.Client{Timeout: 5 * time.Second}
@@
- req, err := http.NewRequest("POST", mockBoltzAdminURL+"/admin/config", bytes.NewReader(body))
+ req, err := http.NewRequestWithContext(
+ t.Context(),
+ http.MethodPost,
+ mockBoltzAdminURL+"/admin/config",
+ bytes.NewReader(body),
+ )
require.NoError(t, err)
@@
- resp, err := http.DefaultClient.Do(req)
+ resp, err := mockBoltzHTTPClient.Do(req)
require.NoError(t, err)
@@
- url := fmt.Sprintf("%s/admin/swaps/%s", mockBoltzAdminURL, swapID)
- resp, err := http.Get(url) //nolint:gosec
+ url := fmt.Sprintf("%s/admin/swaps/%s", mockBoltzAdminURL, swapID)
+ req, err := http.NewRequestWithContext(t.Context(), http.MethodGet, url, nil)
+ require.NoError(t, err)
+ resp, err := mockBoltzHTTPClient.Do(req)
require.NoError(t, err)
defer func() { _ = resp.Body.Close() }()
+ require.Equalf(t, http.StatusOK, resp.StatusCode, "getMockBoltzSwap(%s) failed", swapID)

Also applies to: 262-272, 276-293, 296-307, 413-434, 446-457
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/e2e/utils_test.go` around lines 243 - 259, The helper setMockBoltzConfig
(and the other mock-boltz helpers in this file) use
http.DefaultClient/NewRequest with no context/timeout and proceed to read/parse
responses without first verifying StatusCode; update setMockBoltzConfig to
create a context with a short timeout (e.g., context.WithTimeout), build the
request with http.NewRequestWithContext, use an http.Client with a timeout or
the context, perform the request, immediately check resp.StatusCode and
return/fail with the resp body included in the error message if it's not 200 OK,
then only read/parse the body on success and always defer resp.Body.Close();
apply the same pattern to the other helpers referenced (the GET/POST mock-boltz
helpers in the file).
ARG VERSION
ARG TARGETOS
ARG TARGETARCH
ARG FULMINE_BRANCH=master
Missing build arguments in docker-compose.yml results in empty VERSION and default values.
The docker-compose.yml configuration does not pass any of the declared build args (VERSION, TARGETOS, TARGETARCH, FULMINE_BRANCH). This means:
- VERSION will be empty, resulting in no version information in the binary (line 14)
- FULMINE_BRANCH will default to master
- TARGETOS and TARGETARCH will be unset (Docker will use host platform defaults)
The empty VERSION impacts observability and debugging since the binary won't report its version.
🔧 Proposed fix to add build args in docker-compose.yml
In test/infra/docker-compose.yml, update the boltz-fulmine service to include build args:
boltz-fulmine:
build:
context: .
dockerfile: fulmine.Dockerfile
args:
VERSION: "test-infra"
TARGETOS: linux
TARGETARCH: amd64
FULMINE_BRANCH: master # or pin to a specific commit/tag🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/infra/fulmine.Dockerfile` around lines 4 - 7, Add the missing build args
to the boltz-fulmine service so VERSION, TARGETOS, TARGETARCH and FULMINE_BRANCH
are passed into the Docker build; specifically update the service's build block
in docker-compose.yml to provide args for VERSION (e.g. "test-infra" or a real
version string), TARGETOS (e.g. "linux"), TARGETARCH (e.g. "amd64") and
FULMINE_BRANCH (pin to a branch/tag instead of defaulting to "master") so the
ARGs declared in the fulmine.Dockerfile (VERSION, TARGETOS, TARGETARCH,
FULMINE_BRANCH) are populated during image build and the binary contains version
info.
ARG FULMINE_BRANCH=master

WORKDIR /app
RUN git clone --branch ${FULMINE_BRANCH} --single-branch --depth 1 https://github.com/ArkLabsHQ/fulmine.git .
Non-reproducible builds due to default branch reference.
The Dockerfile defaults to cloning the master branch, which means each build can pull different code as the upstream repository evolves. This creates several risks:
- Build reproducibility: Integration tests may break unexpectedly when upstream changes
- Supply chain security: No verification of source integrity (commit hash, signatures)
- Debugging difficulty: Hard to correlate test failures with specific fulmine versions
For test infrastructure that requires stability, consider pinning to a specific commit hash or tagged version.
🔒 Proposed fix to pin to a specific commit
-ARG FULMINE_BRANCH=master
+ARG FULMINE_COMMIT=abc123def456 # Replace with actual commit hash
WORKDIR /app
-RUN git clone --branch ${FULMINE_BRANCH} --single-branch --depth 1 https://github.com/ArkLabsHQ/fulmine.git .
+RUN git clone https://github.com/ArkLabsHQ/fulmine.git . && \
+ git checkout ${FULMINE_COMMIT}Then in docker-compose.yml:
build:
args:
FULMINE_COMMIT: "abc123def456" # Pin to tested version🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/infra/fulmine.Dockerfile` around lines 7 - 10, The Dockerfile currently
uses ARG FULMINE_BRANCH and runs "git clone ... --branch ${FULMINE_BRANCH}"
which leads to non-reproducible builds; change the build arg to something like
FULMINE_COMMIT and update the RUN step that performs the clone (the RUN git
clone ... line) so it fetches and checks out a specific commit/tag (use a
minimal fetch and git checkout of ${FULMINE_COMMIT} or otherwise fetch by commit
hash) instead of cloning a moving branch; also update any consumers (e.g.,
docker-compose build args) to pass the pinned FULMINE_COMMIT value.
FROM alpine:3.20

WORKDIR /app

COPY --from=go-builder /app/bin/* /app/

ENV PATH="/app:${PATH}"
ENV FULMINE_DATADIR=/app/data

VOLUME /app/data

ENTRYPOINT [ "fulmine" ]
Container runs as root user.
The final image does not specify a non-root user, so the container will run as root. While this is test infrastructure, following the principle of least privilege improves security posture.
🛡️ Proposed fix to add non-root user
FROM alpine:3.20
+RUN addgroup -g 1000 fulmine && \
+ adduser -D -u 1000 -G fulmine fulmine && \
+ mkdir -p /app/data && \
+ chown -R fulmine:fulmine /app
+
WORKDIR /app
COPY --from=go-builder /app/bin/* /app/
ENV PATH="/app:${PATH}"
ENV FULMINE_DATADIR=/app/data
VOLUME /app/data
+USER fulmine
+
ENTRYPOINT [ "fulmine" ]

As per static analysis: Trivy DS-0002 flags that image user should not be 'root'.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
FROM alpine:3.20
RUN addgroup -g 1000 fulmine && \
    adduser -D -u 1000 -G fulmine fulmine && \
    mkdir -p /app/data && \
    chown -R fulmine:fulmine /app
WORKDIR /app
COPY --from=go-builder /app/bin/* /app/
ENV PATH="/app:${PATH}"
ENV FULMINE_DATADIR=/app/data
VOLUME /app/data
USER fulmine
ENTRYPOINT [ "fulmine" ]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/infra/fulmine.Dockerfile` around lines 17 - 28, The image currently runs
as root; create a dedicated non-root user and switch to it: add steps to create
a user/group (e.g. fulmine user), ensure ownership of the FULMINE_DATADIR
(/app/data) and any copied binaries is set to that user, and add a USER
directive before the ENTRYPOINT so the container runs unprivileged; update
references around ENV FULMINE_DATADIR and ENTRYPOINT "fulmine" to ensure the new
user can execute the binary and write to the data directory.
FROM alpine:3.21
RUN apk add --no-cache ca-certificates tzdata
WORKDIR /app
COPY --from=builder /out/mock-boltz /app/mock-boltz
EXPOSE 9001
ENTRYPOINT ["/app/mock-boltz"]
Drop root from the runtime image.
The final stage still runs as root. This service only binds to port 9001 and reads a single binary, so keeping root here is an unnecessary privilege expansion.
Suggested fix
FROM alpine:3.21
-RUN apk add --no-cache ca-certificates tzdata
+RUN apk add --no-cache ca-certificates tzdata \
+ && addgroup -S mockboltz \
+ && adduser -S -G mockboltz mockboltz
WORKDIR /app
COPY --from=builder /out/mock-boltz /app/mock-boltz
+USER mockboltz
EXPOSE 9001
 ENTRYPOINT ["/app/mock-boltz"]

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/infra/mockboltz/Dockerfile` around lines 20 - 25, The final Docker stage
runs as root; create and switch to a non-root user and ensure the binary is
owned and executable by that user before switching. In the Dockerfile change the
runtime stage that contains EXPOSE 9001 and ENTRYPOINT ["/app/mock-boltz"] to
add a user (e.g., addgroup/adduser or adduser -D), chown /app/mock-boltz and any
required dirs to that user, set appropriate file permissions, and add a USER
<username> line before the ENTRYPOINT so the container no longer runs as root.
if msg.Op == "subscribe" && msg.Channel == "swap.update" {
	s.wsMu.RLock()
	client := s.wsClients[conn]
	s.wsMu.RUnlock()
	if client == nil {
		continue
	}
	for _, id := range msg.Args {
		client.subs[id] = struct{}{}
	}

	ack := map[string]any{
		"event":   "subscribe",
		"channel": "swap.update",
		"args":    msg.Args,
	}
	client.mu.Lock()
	_ = conn.WriteJSON(ack)
	client.mu.Unlock()
Guard client.subs with the client mutex.
handleWS() writes client.subs without locking, while pushSwapUpdate() reads the same map concurrently. That can panic with concurrent map read and map write when a subscription races with a pushed update.
Suggested fix
if msg.Op == "subscribe" && msg.Channel == "swap.update" {
s.wsMu.RLock()
client := s.wsClients[conn]
s.wsMu.RUnlock()
if client == nil {
continue
}
+ client.mu.Lock()
for _, id := range msg.Args {
client.subs[id] = struct{}{}
}
ack := map[string]any{
"event": "subscribe",
"channel": "swap.update",
"args": msg.Args,
}
- client.mu.Lock()
_ = conn.WriteJSON(ack)
client.mu.Unlock()
}
}
@@
for conn, client := range s.wsClients {
- if _, ok := client.subs[id]; !ok {
- continue
- }
client.mu.Lock()
- err := conn.WriteJSON(payload)
+ _, subscribed := client.subs[id]
+ var err error
+ if subscribed {
+ err = conn.WriteJSON(payload)
+ }
client.mu.Unlock()
if err != nil {
log.WithError(err).Warn("failed to push ws event")
}
 }

Also applies to: 947-953
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/infra/mockboltz/server.go` around lines 591 - 609, handleWS is modifying
client.subs without locking while pushSwapUpdate reads it, causing a concurrent
map panic; wrap all writes to client.subs with the client's mutex (client.mu)
the same way reads are protected. Specifically, in the subscribe branch where
you iterate msg.Args and set client.subs[id] = struct{}{} (and the similar block
at the other location mentioned), acquire client.mu.Lock() before mutating
client.subs and defer or unlock after the updates, keeping the existing
conn.WriteJSON ack calls protected as needed to avoid races between subscription
changes and pushSwapUpdate.
Followup review (commit: single-line linting fix)
🧹 Nitpick comments (2)
test/infra/mockboltz/server.go (2)
1108-1113: Document the 512-threshold heuristic for locktime type.

The cutoff at 512 is a simplification. Real BIP-68 uses a flag bit, not a value threshold. Consider adding a brief comment so future readers understand the mock's behavior.
Suggested improvement
+// parseRelativeLocktime uses a simple heuristic: values >= 512 are treated as
+// seconds, otherwise as blocks. This is a mock simplification; real BIP-68
+// uses a flag bit to distinguish the two types.
 func parseRelativeLocktime(locktime uint32) arklib.RelativeLocktime {
 	if locktime >= 512 {
 		return arklib.RelativeLocktime{Type: arklib.LocktimeTypeSecond, Value: locktime}
 	}
 	return arklib.RelativeLocktime{Type: arklib.LocktimeTypeBlock, Value: locktime}
 }

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/infra/mockboltz/server.go` around lines 1108 - 1113, The function parseRelativeLocktime uses a simplified heuristic (locktime >= 512 => second-based) which differs from BIP-68 that inspects the high bit to determine time-vs-block semantics; add a concise comment above parseRelativeLocktime explaining this: state that the 512 cutoff is an intentional mock simplification of BIP-68's flag-bit behavior, mention the real rule (use the high bit to select between LocktimeTypeSecond and LocktimeTypeBlock), and note that the function returns arklib.RelativeLocktime with Type set to arklib.LocktimeTypeSecond or arklib.LocktimeTypeBlock accordingly so future readers understand the deviation.
449-454: Line exceeds golines max length.

Static analysis reports this line is not properly formatted per golines rules.
Suggested fix
 filtered := make([]*psbt.TaprootScriptSpendSig, 0, len(in.TaprootScriptSpendSig)+1)
 for _, existing := range in.TaprootScriptSpendSig {
-	if bytes.Equal(existing.XOnlyPubKey, xOnly) && bytes.Equal(existing.LeafHash, leafHash[:]) {
+	sameKey := bytes.Equal(existing.XOnlyPubKey, xOnly)
+	sameLeaf := bytes.Equal(existing.LeafHash, leafHash[:])
+	if sameKey && sameLeaf {
 		continue
 	}
 	filtered = append(filtered, existing)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/infra/mockboltz/server.go` around lines 449 - 454, The line creating filtered is too long; split the make call so it fits golines (e.g., compute the capacity into a short variable or use a var declaration) — replace "filtered := make([]*psbt.TaprootScriptSpendSig, 0, len(in.TaprootScriptSpendSig)+1)" with a multi-line form such as "cap := len(in.TaprootScriptSpendSig) + 1" followed by "filtered := make([]*psbt.TaprootScriptSpendSig, 0, cap)" or "var filtered []*psbt.TaprootScriptSpendSig; filtered = make([]*psbt.TaprootScriptSpendSig, 0, len(in.TaprootScriptSpendSig)+1)" ensuring identifiers filtered, psbt.TaprootScriptSpendSig and in.TaprootScriptSpendSig are used as shown.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@test/infra/mockboltz/server.go`:
- Around line 1108-1113: The function parseRelativeLocktime uses a simplified
heuristic (locktime >= 512 => second-based) which differs from BIP-68 that
inspects the high bit to determine time-vs-block semantics; add a concise
comment above parseRelativeLocktime explaining this: state that the 512 cutoff
is an intentional mock simplification of BIP-68's flag-bit behavior, mention the
real rule (use the high bit to select between LocktimeTypeSecond and
LocktimeTypeBlock), and note that the function returns arklib.RelativeLocktime
with Type set to arklib.LocktimeTypeSecond or arklib.LocktimeTypeBlock
accordingly so future readers understand the deviation.
- Around line 449-454: The line creating filtered is too long; split the make
call so it fits golines (e.g., compute the capacity into a short variable or use
a var declaration) — replace "filtered := make([]*psbt.TaprootScriptSpendSig, 0,
len(in.TaprootScriptSpendSig)+1)" with a multi-line form such as "cap :=
len(in.TaprootScriptSpendSig) + 1" followed by "filtered :=
make([]*psbt.TaprootScriptSpendSig, 0, cap)" or "var filtered
[]*psbt.TaprootScriptSpendSig; filtered = make([]*psbt.TaprootScriptSpendSig, 0,
len(in.TaprootScriptSpendSig)+1)" ensuring identifiers filtered,
psbt.TaprootScriptSpendSig and in.TaprootScriptSpendSig are used as shown.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b4e71f37-badf-446f-a586-b7469c8e9448
📒 Files selected for processing (1)
test/infra/mockboltz/server.go
🔍 Arkana PR Review —
Actionable comments posted: 2
♻️ Duplicate comments (2)
test/infra/mockboltz/server.go (1)
592-601: ⚠️ Potential issue | 🔴 Critical

Lock `client.subs` on both the subscribe and broadcast paths.

At Lines 599-600 `handleWS()` writes `client.subs` without `client.mu`, while `pushSwapUpdate()` reads the same map at Lines 977-978. That can still panic with `concurrent map read and map write` when a subscribe races with an injected update.

Suggested fix
 if msg.Op == "subscribe" && msg.Channel == "swap.update" {
 	s.wsMu.RLock()
 	client := s.wsClients[conn]
 	s.wsMu.RUnlock()
 	if client == nil {
 		continue
 	}
+	client.mu.Lock()
 	for _, id := range msg.Args {
 		client.subs[id] = struct{}{}
 	}
 	ack := map[string]any{
 		"event":   "subscribe",
 		"channel": "swap.update",
 		"args":    msg.Args,
 	}
-	client.mu.Lock()
 	_ = conn.WriteJSON(ack)
 	client.mu.Unlock()
 }
@@
 for conn, client := range s.wsClients {
-	if _, ok := client.subs[id]; !ok {
-		continue
-	}
 	client.mu.Lock()
-	err := conn.WriteJSON(payload)
+	_, subscribed := client.subs[id]
+	var err error
+	if subscribed {
+		err = conn.WriteJSON(payload)
+	}
 	client.mu.Unlock()
 	if err != nil {
 		log.WithError(err).Warn("failed to push ws event")
 	}
 }

Also applies to: 976-982
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/infra/mockboltz/server.go` around lines 592 - 601, handleWS() is writing to the client's subscription map (client.subs) without taking client.mu while pushSwapUpdate() reads that map, causing potential concurrent map read/write panics; update handleWS() (the subscribe branch where msg.Op=="subscribe" && msg.Channel=="swap.update") to acquire the per-client mutex (client.mu) before modifying client.subs and release it after, and likewise ensure pushSwapUpdate() wraps its reads of client.subs in the same client.mu; use the existing s.wsMu only for accessing s.wsClients[conn] and do not replace the per-client locking for subs access.

test/e2e/swap_test.go (1)
48-53: ⚠️ Potential issue | 🟠 Major

Synchronize each shared `events` collector.

At Lines 50-51, 157-158, 413-414, 509-510, 616-617, 723-724, and 853-854 the callback appends from an async goroutine, while the test goroutine later iterates/asserts on the same slice. That is a real data race and will make this e2e suite flaky under `-race`.

Suggested fix

-	var events []swap.ChainSwapEvent
+	var (
+		eventsMu sync.Mutex
+		events   []swap.ChainSwapEvent
+	)
 	eventCallback := func(event swap.ChainSwapEvent) {
+		eventsMu.Lock()
 		events = append(events, event)
+		eventsMu.Unlock()
 		t.Logf("ChainSwap event: %T", event)
 	}
@@
-	for _, e := range events {
+	eventsMu.Lock()
+	snapshot := append([]swap.ChainSwapEvent(nil), events...)
+	eventsMu.Unlock()
+	for _, e := range snapshot {

Apply the same pattern to the other `events` slices in this file.

Also applies to: 156-160, 412-416, 508-512, 615-619, 722-726, 852-856
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/swap_test.go` around lines 48 - 53, The shared slice events (type swap.ChainSwapEvent) is appended from async callbacks (eventCallback) and later read from the test goroutine, causing a data race; protect access by introducing a sync.Mutex (e.g., eventsMu) next to each events slice and lock/unlock around every append in the callback and around any iteration/assertion reads (or make a locked copy before iterating), and apply this same mutex-guard pattern to the other events slices/callbacks in this file (the other occurrences referenced in the review).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/e2e/swap_test.go`:
- Around line 549-567: The test currently treats reaching a terminal refund
state as success even if the unilateralRefundCalled channel never fired; change
the logic in the select that waits on unilateralRefundCalled so the test fails
if unilateralRefundCalled is not received within the timeout instead of
accepting terminal statuses as a fallback. Locate the select using the
unilateralRefundCalled channel and chainSwap.GetStatus() and remove or tighten
the switch that treats swap.ChainSwapRefundedUnilaterally / swap.ChainSwapFailed
/ swap.ChainSwapRefundFailed as success, ensuring the test explicitly requires
the unilateralRefundCalled case to occur (or fails on timeout) to prove the
callback path fires.
In `@test/infra/mockboltz/server.go`:
- Around line 1197-1208: The http.Get call used during Ark signer discovery (in
the New() flow of the mock server) has no timeout and can hang; replace the
plain http.Get(url) with a request that enforces a deadline—create a context
with timeout (e.g., context.WithTimeout) or use an http.Client with a Timeout
and perform client.Do(req) (or http.NewRequestWithContext) so each retry is
bounded, propagate/close the response body as before, and surface the
context/client error into lastErr for the retry loop.
---
Duplicate comments:
In `@test/e2e/swap_test.go`:
- Around line 48-53: The shared slice events (type swap.ChainSwapEvent) is
appended from async callbacks (eventCallback) and later read from the test
goroutine, causing a data race; protect access by introducing a sync.Mutex
(e.g., eventsMu) next to each events slice and lock/unlock around every append
in the callback and around any iteration/assertion reads (or make a locked copy
before iterating), and apply this same mutex-guard pattern to the other events
slices/callbacks in this file (the other occurrences referenced in the review).
In `@test/infra/mockboltz/server.go`:
- Around line 592-601: handleWS() is writing to the client's subscription map
(client.subs) without taking client.mu while pushSwapUpdate() reads that map,
causing potential concurrent map read/write panics; update handleWS() (the
subscribe branch where msg.Op=="subscribe" && msg.Channel=="swap.update") to
acquire the per-client mutex (client.mu) before modifying client.subs and
release it after, and likewise ensure pushSwapUpdate() wraps its reads of
client.subs in the same client.mu; use the existing s.wsMu only for accessing
s.wsClients[conn] and do not replace the per-client locking for subs access.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9ba7acdb-109c-4d4e-bc0e-c7c7311ee06f
📒 Files selected for processing (3)
test/e2e/swap_test.go
test/e2e/utils_test.go
test/infra/mockboltz/server.go
🚧 Files skipped from review as they are similar to previous changes (1)
- test/e2e/utils_test.go
// Wait for the unilateral refund callback to fire
select {
case <-unilateralRefundCalled:
	t.Logf("Unilateral refund callback confirmed for swap %s", chainSwap.Id)
	require.NotEmpty(t, events, "should have received events")
case <-time.After(60 * time.Second):
	switch chainSwap.GetStatus() {
	case swap.ChainSwapRefundedUnilaterally, swap.ChainSwapFailed, swap.ChainSwapRefundFailed:
		t.Logf(
			"Swap reached terminal state %d (valid for unilateral refund test)",
			chainSwap.GetStatus(),
		)
	default:
		t.Fatalf(
			"timed out waiting for unilateral refund, swap status: %d",
			chainSwap.GetStatus(),
		)
	}
}
Keep the unilateral-refund callback assertion strict.
The fallback at Lines 554-566 currently passes once the swap reaches a terminal refund status, even if unilateralRefundCalled never fires. That means this test can succeed without proving the callback path it is named after.
Suggested fix
select {
case <-unilateralRefundCalled:
t.Logf("Unilateral refund callback confirmed for swap %s", chainSwap.Id)
require.NotEmpty(t, events, "should have received events")
case <-time.After(60 * time.Second):
- switch chainSwap.GetStatus() {
- case swap.ChainSwapRefundedUnilaterally, swap.ChainSwapFailed, swap.ChainSwapRefundFailed:
- t.Logf(
- "Swap reached terminal state %d (valid for unilateral refund test)",
- chainSwap.GetStatus(),
- )
- default:
- t.Fatalf(
- "timed out waiting for unilateral refund, swap status: %d",
- chainSwap.GetStatus(),
- )
- }
+ t.Fatalf(
+ "timed out waiting for unilateral refund callback, swap status: %d",
+ chainSwap.GetStatus(),
+ )
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
// Wait for the unilateral refund callback to fire
select {
case <-unilateralRefundCalled:
	t.Logf("Unilateral refund callback confirmed for swap %s", chainSwap.Id)
	require.NotEmpty(t, events, "should have received events")
case <-time.After(60 * time.Second):
	switch chainSwap.GetStatus() {
	case swap.ChainSwapRefundedUnilaterally, swap.ChainSwapFailed, swap.ChainSwapRefundFailed:
		t.Logf(
			"Swap reached terminal state %d (valid for unilateral refund test)",
			chainSwap.GetStatus(),
		)
	default:
		t.Fatalf(
			"timed out waiting for unilateral refund, swap status: %d",
			chainSwap.GetStatus(),
		)
	}
}

// Wait for the unilateral refund callback to fire
select {
case <-unilateralRefundCalled:
	t.Logf("Unilateral refund callback confirmed for swap %s", chainSwap.Id)
	require.NotEmpty(t, events, "should have received events")
case <-time.After(60 * time.Second):
	t.Fatalf(
		"timed out waiting for unilateral refund callback, swap status: %d",
		chainSwap.GetStatus(),
	)
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/e2e/swap_test.go` around lines 549 - 567, The test currently treats
reaching a terminal refund state as success even if the unilateralRefundCalled
channel never fired; change the logic in the select that waits on
unilateralRefundCalled so the test fails if unilateralRefundCalled is not
received within the timeout instead of accepting terminal statuses as a
fallback. Locate the select using the unilateralRefundCalled channel and
chainSwap.GetStatus() and remove or tighten the switch that treats
swap.ChainSwapRefundedUnilaterally / swap.ChainSwapFailed /
swap.ChainSwapRefundFailed as success, ensuring the test explicitly requires the
unilateralRefundCalled case to occur (or fails on timeout) to prove the callback
path fires.
for i := 0; i < 30; i++ {
	resp, err := http.Get(url)
	if err != nil {
		lastErr = err
		time.Sleep(500 * time.Millisecond)
		continue
	}
	body, _ := io.ReadAll(resp.Body)
	_ = resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		lastErr = fmt.Errorf("http %d: %s", resp.StatusCode, strings.TrimSpace(string(body)))
		time.Sleep(500 * time.Millisecond)
🧩 Analysis chain
🏁 Script executed:
find . -name "server.go" -path "*/mockboltz/*" | head -20Repository: arkade-os/go-sdk
Length of output: 92
🏁 Script executed:
cat -n ./test/infra/mockboltz/server.go | sed -n '1190,1220p'Repository: arkade-os/go-sdk
Length of output: 1226
🏁 Script executed:
cat -n ./test/infra/mockboltz/server.go | sed -n '1220,1240p'Repository: arkade-os/go-sdk
Length of output: 784
🏁 Script executed:
grep -n "fetchArkSignerPubKey" ./test/infra/mockboltz/server.go | head -10Repository: arkade-os/go-sdk
Length of output: 191
🏁 Script executed:
cat -n ./test/infra/mockboltz/server.go | sed -n '170,190p'Repository: arkade-os/go-sdk
Length of output: 725
🏁 Script executed:
cat -n ./test/infra/mockboltz/server.go | sed -n '130,180p'Repository: arkade-os/go-sdk
Length of output: 1564
Add request deadline to Ark signer discovery HTTP calls.
http.Get at line 1198 uses the default client with no timeout. A stalled arkd connection will block New() indefinitely and hang the mock stack despite the retry loop.
Suggested fix
func fetchArkSignerPubKey(arkdURL string) (*btcec.PublicKey, error) {
if strings.TrimSpace(arkdURL) == "" {
return nil, fmt.Errorf("MOCK_BOLTZ_ARKD_URL is required")
}
url := strings.TrimRight(strings.TrimSpace(arkdURL), "/") + "/v1/info"
+ client := &http.Client{Timeout: 5 * time.Second}
var lastErr error
for i := 0; i < 30; i++ {
- resp, err := http.Get(url)
+ resp, err := client.Get(url)
if err != nil {
lastErr = err
time.Sleep(500 * time.Millisecond)
continue
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
for i := 0; i < 30; i++ {
	resp, err := http.Get(url)
	if err != nil {
		lastErr = err
		time.Sleep(500 * time.Millisecond)
		continue
	}
	body, _ := io.ReadAll(resp.Body)
	_ = resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		lastErr = fmt.Errorf("http %d: %s", resp.StatusCode, strings.TrimSpace(string(body)))
		time.Sleep(500 * time.Millisecond)

func fetchArkSignerPubKey(arkdURL string) (*btcec.PublicKey, error) {
	if strings.TrimSpace(arkdURL) == "" {
		return nil, fmt.Errorf("MOCK_BOLTZ_ARKD_URL is required")
	}
	url := strings.TrimRight(strings.TrimSpace(arkdURL), "/") + "/v1/info"
	client := &http.Client{Timeout: 5 * time.Second}
	var lastErr error
	for i := 0; i < 30; i++ {
		resp, err := client.Get(url)
		if err != nil {
			lastErr = err
			time.Sleep(500 * time.Millisecond)
			continue
		}
		body, _ := io.ReadAll(resp.Body)
		_ = resp.Body.Close()
		if resp.StatusCode != http.StatusOK {
			lastErr = fmt.Errorf("http %d: %s", resp.StatusCode, strings.TrimSpace(string(body)))
			time.Sleep(500 * time.Millisecond)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/infra/mockboltz/server.go` around lines 1197 - 1208, The http.Get call
used during Ark signer discovery (in the New() flow of the mock server) has no
timeout and can hang; replace the plain http.Get(url) with a request that
enforces a deadline—create a context with timeout (e.g., context.WithTimeout) or
use an http.Client with a Timeout and perform client.Do(req) (or
http.NewRequestWithContext) so each retry is bounded, propagate/close the
response body as before, and surface the context/client error into lastErr for
the retry loop.
This moves Boltz's Swap and VHTLC packages from fulmine to go-sdk.
It also adjusts the testing infrastructure and adds adapted tests (without the Fulmine client) for swaps/VHTLC.
Summary by CodeRabbit
New Features
Infrastructure
Tests
Chores