Title: PANTHER requires two .bim file versions and results in 0 overlapping SNPs

Dear X‑Wing team,

I’m encountering an issue when running PANTHER that might be worth documenting or patching.

When I try to run PANTHER using my target dataset, I run into two conflicting requirements for the .bim file:

munge_ref() expects the BIM file to have a header so that it can read column names for SNP alignment. This is consistent with the files downloaded from your GitHub page. However, munge_bim() expects a headerless PLINK-style BIM file (CHR, SNP, CM, BP, A1, A2); otherwise it throws:

"ValueError: invalid literal for int() with base 10: 'CHR'"

To work around this, I had to create two versions of the same file:

snpinfo_mult_1kg_hm3 → original, headered file (used by munge_ref)

snpinfo_mult_1kg_hm3.bim → headerless copy (used by munge_bim)
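
For anyone reproducing this, the headerless copy can be produced along the following lines. This is only a minimal sketch, not necessarily the exact commands used here: it assumes the headered file has the CHR, SNP, BP, A1, A2 columns shown in the example further down, and the helper name headered_to_bim is hypothetical (it is not part of PANTHER).

```python
import io


def headered_to_bim(src, dst):
    """Copy a headered SNP table to a six-column, headerless BIM layout.

    `src` and `dst` are open text handles. Returns the number of rows written.
    """
    # Map column names from the header line to their positions.
    header = src.readline().split()
    idx = {name: i for i, name in enumerate(header)}
    written = 0
    for line in src:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # The headered file has no genetic-distance column, so CM is written as 0.
        row = [
            fields[idx["CHR"]],
            fields[idx["SNP"]],
            "0",
            fields[idx["BP"]],
            fields[idx["A1"]],
            fields[idx["A2"]],
        ]
        dst.write(" ".join(row) + "\n")
        written += 1
    return written
```

Called with the headered file opened for reading and snpinfo_mult_1kg_hm3.bim opened for writing, this yields rows in the CHR, SNP, CM, BP, A1, A2 order.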

This allowed the program to proceed, but at the alignment step (munge_sumstats / align_ldblk) PANTHER reports 0 overlapping SNPs, even though I verified that ~1.17 million SNPs are shared between (1) the reference panel (snpinfo_mult_1kg_hm3), (2) the target BIM file, (3) the GWAS summary stats, and (4) the LOGODetect results. Python/pandas confirms that 1,177,049 SNPs overlap across these files, yet PANTHER reports 0 overlapping SNPs and no SNPs survive to MCMC.
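
An independent overlap check of this kind can be sketched as follows. It is a simplified illustration in plain Python rather than the exact script used here, and the helper names read_snp_ids and shared_snps are hypothetical. It only compares rsIDs; any additional allele or position matching that PANTHER performs internally is not reproduced.

```python
import io


def read_snp_ids(handle, has_header, snp_col=1):
    """Return the set of SNP IDs in one whitespace-delimited file.

    With a header, the SNP column is located by name; without one,
    `snp_col` is used (index 1 in a PLINK-style BIM file).
    """
    if has_header:
        header = handle.readline().split()
        snp_col = header.index("SNP")
    return {line.split()[snp_col] for line in handle if line.strip()}


def shared_snps(*id_sets):
    """Return the SNP IDs present in every one of the given sets."""
    return set.intersection(*id_sets)
```

Reading the reference panel, the headerless .bim, and each summary statistics file into sets and passing them to shared_snps gives the SNPs common to all of them, which can then be compared with the count PANTHER reports.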


Example of my files:

Headered BIM (snpinfo_mult_1kg_hm3):

CHR SNP BP A1 A2
1   rs28527770 751756 C T
1   rs3094315  752566 A G


Headerless .bim for PLINK/PANTHER:

1 rs28527770 0 751756 C T
1 rs3094315  0 752566 A G

GWAS summary stats (GSCANEur_Linux.txt):

CHR SNP BP A1 A2 BETA P
1 rs28527770 751756 C T 0.0403 0.9678
1 rs3094315  752566 A G -0.0410 0.9673


A reproducible example is below. I can send the GSCAN files I used; otherwise, any set of similarly formatted SNPs should be sufficient:

Use the headered snpinfo_mult_1kg_hm3 together with its headerless .bim copy.

Run:

python PANTHER.py \
  --ref_dir PANTHER_1kg_ref \
  --bim_prefix PANTHER_1kg_ref/snpinfo_mult_1kg_hm3 \
  --sumstats GSCANEur_Linux.txt,GSCANAfr_Linux.txt \
  --n_gwas 724269,158284 \
  --anno_file LOGODetect_Test/annot_EUR.txt,LOGODetect_Test/annot_AFR.txt \
  --chrom 1 \
  --pop EUR,AFR \
  --target_pop AFR \
  --pst_pop AFR \
  --out_name SmkInit \
  --seed 3 \
  --out_dir PANTHER/post

Output: 0 overlapping SNPs

What do you think?
Is there any way to update munge_bim() to detect and skip a header automatically?

Is there any way to let users pass a single headered .bim file to simplify the workflow?
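
To make the request concrete, here is a rough sketch of what header detection could look like. I have not based this on PANTHER's actual munge_bim() code; the function name read_bim and the rule for recognising a header (first field equal to CHR) are my own assumptions.

```python
import io

# Column order of a headerless PLINK-style BIM file.
BIM_COLUMNS = ["CHR", "SNP", "CM", "BP", "A1", "A2"]


def read_bim(handle):
    """Parse BIM-like text with or without a header line.

    If the first line starts with CHR (optionally prefixed by '#'),
    its fields are used as column names. Otherwise the six PLINK
    columns are assumed.
    """
    columns = BIM_COLUMNS
    rows = []
    for lineno, line in enumerate(handle):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if lineno == 0 and fields[0].lstrip("#").upper() == "CHR":
            # Header detected: take column names from the file, then skip the line.
            columns = [name.lstrip("#").upper() for name in fields]
            continue
        rec = dict(zip(columns, fields))
        rows.append({
            # int() here mirrors the conversion implied by the reported ValueError.
            "CHR": int(rec["CHR"]),
            "SNP": rec["SNP"],
            "BP": int(rec["BP"]),
            "A1": rec["A1"],
            "A2": rec["A2"],
        })
    return rows
```

With something like this, the headered snpinfo file and the headerless .bim copy would parse to the same records, so a single file could serve both steps.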


Thank you for maintaining this tool!
I’m happy to test any patch that resolves the dual‑BIM requirement and the 0‑overlap filtering.

I can also create a small ZIP with the headered BIM + sumstats + GSCAN files if that would help reproduce the behavior.

Alexander
