Skip to content

feat(prepro): add --retry-reverse-complement=true when running nextclade in prepro#6065

Merged
anna-parker merged 1 commit into
mainfrom
retry_reverse_complement
Mar 2, 2026
Merged

feat(prepro): add --retry-reverse-complement=true when running nextclade in prepro#6065
anna-parker merged 1 commit into
mainfrom
retry_reverse_complement

Conversation

@anna-parker

@anna-parker anna-parker commented Mar 2, 2026

Copy link
Copy Markdown
Contributor

resolves #2652

Screenshot

PR Checklist

  • All necessary documentation has been adapted.
  • The implemented feature is covered by appropriate, automated tests.
  • Any manual testing that has been done is documented (i.e. what exactly was tested?)

🚀 Preview: https://retry-reverse-complement.loculus.org

@anna-parker anna-parker marked this pull request as ready for review March 2, 2026 10:30
@anna-parker anna-parker added the preview Triggers a deployment to argocd label Mar 2, 2026
@claude

claude Bot commented Mar 2, 2026

Copy link
Copy Markdown
Contributor

Claude finished @anna-parker's task in 1m 25s —— View job


Code Review

  • Read changed files
  • Analyze the change in context
  • Provide review feedback

The PR adds --retry-reverse-complement=true to both invocations of nextclade3 run in the preprocessing pipeline — once in assign_segment_with_nextclade_align (for segment classification) and once in enrich_with_nextclade (for main alignment/QC). The flag is applied consistently across both call sites, which is correct.

What the flag does: When a sequence fails to align, nextclade retries with the reverse complement. This improves robustness for sequences submitted in the wrong orientation.

One thing worth considering: The nextclade JSON output includes an isReverseComplement field when a sequence was aligned as its reverse complement. Currently this field is not surfaced to submitters as a warning. It could be worth adding a warning annotation when result["isReverseComplement"] is True (in parse_nextclade_json or wherever the JSON metadata is processed) — so submitters know their sequence was flipped. That said, this would be a follow-up improvement and isn't a blocker for this PR.

Testing: No tests have been added or updated. Given this is a flag change to an external tool call rather than new business logic, that's reasonable — the behavior of nextclade itself can't easily be unit-tested here.

No opt-out: The flag is hardcoded to true with no config option to disable it. If there are organisms/datasets where reverse complement retry is undesirable (e.g. where strand orientation is meaningful metadata), this can't be turned off without a code change. Worth noting, but may not matter in practice.

Overall this is a sensible improvement to alignment robustness — no issues blocking merge.

@claude claude Bot added the preprocessing Issues related to the preprocessing component label Mar 2, 2026
@claude

claude Bot commented Mar 2, 2026

Copy link
Copy Markdown
Contributor

This PR may be related to: #2652 (Handling reverse complemented sequences)

@anna-parker anna-parker merged commit e331b29 into main Mar 2, 2026
64 checks passed
@anna-parker anna-parker deleted the retry_reverse_complement branch March 2, 2026 12:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

preprocessing Issues related to the preprocessing component preview Triggers a deployment to argocd

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Handling reverse complemented sequences

2 participants