Dear Parsnp developers,
Thank you for the great tool! After running Parsnp, I wanted to create a SNP distance matrix. Following what I could find in the literature, I used snp-dists to build the matrix from the parsnpLCB.aln file, which I generated from the .ggr file that Parsnp normally outputs. Relevant code and versions are below.
As a sanity check, I compared two numbers for each isolate: the count of VCF sites with a value > 0 in that isolate's column (> 0 rather than = 1, to account for sites with more than one alternative base) and the isolate's entry in the reference column of the distance matrix. The two agree for most isolates but differ for 41 of 4775. In every one of those cases, snp-dists reports exactly one more core SNP relative to the reference than the VCF does.
What could be causing this difference? It looks as though snp-dists found an additional SNP (or multiple SNPs) that did not make it into the VCF file. Are SNPs filtered between the .ggr file and the creation of the .vcf file?
Alternatively, is there a way within harvesttools/parsnp to create a SNP distance matrix from the .ggr or .aln files, or from the VCF file, so that I do not have to use an outside tool?
I understand this may be a snp-dists issue, but I wanted to ask some Parsnp-related questions. I hope that is okay.
Example tables are below. Please let me know if you need any additional information.
Thanks for your time and help.
Sincerely,
David
VCF Example

| | Ref | Sample_1 | Sample_2 | Sample_3 |
|---|---|---|---|---|
| Position_1 | 0 | 1 | 0 | 0 |
| Position_2 | 0 | 2 | 1 | 0 |
| Position_3 | 0 | 0 | 1 | 2 |
| Position_4 | 0 | 0 | 1 | 1 |
| Position_5 | 0 | 0 | 1 | 1 |
| Total SNPs vs Ref | 0 | 2 | 4 | 3 |

snp-dists example

| | Ref | Sample_1 | Sample_2 | Sample_3 |
|---|---|---|---|---|
| Ref | 0 | 2 | 4 | 4 |
| Sample_1 | 2 | 0 | 5 | 5 |
| Sample_2 | 4 | 5 | 0 | 4 |
| Sample_3 | 4 | 5 | 4 | 0 |

Comparison example

| | vcf_core_snps_Ref | dist_core_snps_Ref |
|---|---|---|
| Sample_1 | 2 | 2 |
| Sample_2 | 4 | 4 |
| Sample_3 | 3 | 4 |

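A minimal sketch of the sanity check described above, written in Python. It assumes a standard tab-separated VCF layout (nine fixed columns followed by one column per sample, with haploid allele indices such as 0, 1, 2) and a snp-dists matrix with sample names in the first row and first column. The function names, the contig name, and the REF/ALT bases in the example data are hypothetical placeholders; only the genotype values and distances come from the example tables.

```python
def count_nonref_sites(vcf_lines):
    """Count, per sample, the sites whose allele index is greater than 0."""
    samples, counts = [], {}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the 9 fixed columns
            counts = {name: 0 for name in samples}
            continue
        for name, genotype in zip(samples, fields[9:]):
            allele = genotype.split(":")[0]
            # Any non-zero index counts once, so a second ALT allele (2)
            # is treated the same as the first (1).
            if allele.isdigit() and int(allele) > 0:
                counts[name] += 1
    return counts


def reference_column(matrix_lines, ref_name):
    """Read the distances to ref_name from a tab-separated square matrix."""
    rows = [l.rstrip("\n").split("\t") for l in matrix_lines if l.strip()]
    col = rows[0][1:].index(ref_name) + 1  # +1 skips the row-name column
    return {row[0]: int(row[col]) for row in rows[1:]}


def find_mismatches(vcf_counts, dist_counts):
    """Return {sample: (vcf_count, dist_count)} where the two disagree."""
    return {
        name: (vcf_counts[name], dist_counts[name])
        for name in vcf_counts
        if name in dist_counts and vcf_counts[name] != dist_counts[name]
    }


# Example data mirroring the tables above (placeholder contig and bases).
HEADER = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
          "FORMAT", "Ref", "Sample_1", "Sample_2", "Sample_3"]
GENOTYPES = [
    ["0", "1", "0", "0"],
    ["0", "2", "1", "0"],
    ["0", "0", "1", "2"],
    ["0", "0", "1", "1"],
    ["0", "0", "1", "1"],
]
EXAMPLE_VCF = ["##fileformat=VCFv4.2", "\t".join(HEADER)] + [
    "\t".join(["contig1", str(pos), ".", "A", "C,G", ".", "PASS", ".", "GT"] + gts)
    for pos, gts in enumerate(GENOTYPES, start=1)
]
EXAMPLE_MATRIX = [
    "\tRef\tSample_1\tSample_2\tSample_3",
    "Ref\t0\t2\t4\t4",
    "Sample_1\t2\t0\t5\t5",
    "Sample_2\t4\t5\t0\t4",
    "Sample_3\t4\t5\t4\t0",
]

vcf_counts = count_nonref_sites(EXAMPLE_VCF)
dist_counts = reference_column(EXAMPLE_MATRIX, "Ref")
mismatches = find_mismatches(vcf_counts, dist_counts)
print(mismatches)
```

On the example tables this flags only Sample_3, where the VCF count is 3 and the snp-dists distance is 4. Listing the flagged isolates this way makes it easier to go back to the alignment and look for the one site that differs.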
Versions:
parsnp 1.7.4
harvesttools 1.2
snp-dists 0.8.2
Scripts:
parsnp -r SX514.polish.fna -d FSIS_assemblies -o parsnp_SX514_Ref --vcf --threads 12
harvesttools -i parsnp.ggr -M parsnpLCB.aln
snp-dists parsnpLCB.aln > distances.tab -b
snp-dists parsnpLCB.aln > distances_pw.txt -m
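For cross-checking the snp-dists step, here is a minimal Python sketch that counts pairwise differences directly from an aligned multi-FASTA such as parsnpLCB.aln. It assumes all sequences have equal length and, by default, counts a column only when both bases are A, C, G, or T, which as far as I understand matches the snp-dists default (its -a option counts all differing characters). The function names and the toy alignment are hypothetical, and this is not a Parsnp or harvesttools feature.

```python
from itertools import combinations


def read_fasta(lines):
    """Parse FASTA lines into {name: uppercase sequence}."""
    parts, name = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # keep the ID up to the first space
            parts[name] = []
        elif name is not None:
            parts[name].append(line.upper())
    return {n: "".join(chunks) for n, chunks in parts.items()}


def snp_distance(a, b, count_all=False):
    """Count differing columns between two aligned sequences.

    With count_all=False, columns where either base is not A/C/G/T
    (for example N or a gap) are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    return sum(
        1
        for x, y in zip(a, b)
        if x != y and (count_all or (x in valid and y in valid))
    )


def distance_matrix(seqs, count_all=False):
    """Return a symmetric {name: {name: distance}} table."""
    matrix = {n: {m: 0 for m in seqs} for n in seqs}
    for n, m in combinations(seqs, 2):
        d = snp_distance(seqs[n], seqs[m], count_all)
        matrix[n][m] = matrix[m][n] = d
    return matrix


# Toy alignment (hypothetical data) with one N and one gap in Sample_2.
TOY_ALIGNMENT = [
    ">Ref", "ACGTACGT",
    ">Sample_1", "ACGTACGA",
    ">Sample_2", "ACNTAC-T",
]
toy_seqs = read_fasta(TOY_ALIGNMENT)
strict = distance_matrix(toy_seqs)
loose = distance_matrix(toy_seqs, count_all=True)
print(strict["Ref"], loose["Ref"])
```

In the toy data, Sample_2 is 0 SNPs from Ref under the A/C/G/T-only rule but 2 when every differing character is counted. Comparing the two modes on a flagged isolate shows whether ambiguous bases or gaps play any role in a given count; it does not by itself explain the discrepancy reported here.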