segger in COSMX data - data generation step #74
Comments
Dear @joaolsf, It's great to hear that you tried segger on cosmx data. We need to get to the bottom of this, as we haven't yet tested segger thoroughly on cosmx data. It is at the top of our to-do list, though. My first impression is that there is a bug in the graphs built on transcripts that leads to this problem. So if you could share a link to the dataset (if it's public) or a very small subset of the data (let's say 10K transcripts and the corresponding boundaries), I can go ahead and debug. We're going to fix this asap. P.S. It would be a great contribution to the package if you could make a PR adding the yaml file/script to run on cosmx data. This will allow more users to run segger on new datasets. Thanks!
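For anyone preparing a similar small subset for debugging, here is a minimal sketch, not part of segger. It assumes pandas is installed, that both tables are already loaded as DataFrames (for example with `pd.read_parquet`), and that they share a cell identifier column; the helper name `small_subset` and the default column name `cell_id` are hypothetical and should be adjusted to your files.

```python
import pandas as pd


def small_subset(transcripts: pd.DataFrame,
                 boundaries: pd.DataFrame,
                 n_transcripts: int = 10_000,
                 cell_col: str = "cell_id"):
    """Return the first n transcripts and the boundaries of the cells they reference.

    Hypothetical helper: `cell_col` is an assumed column name shared by both tables.
    """
    # Take the first n rows; these are not guaranteed to be spatially contiguous.
    tx_small = transcripts.head(n_transcripts)
    # Keep only boundary rows whose cell appears in the transcript subset.
    cells = tx_small[cell_col].dropna().unique()
    bd_small = boundaries[boundaries[cell_col].isin(cells)]
    return tx_small, bd_small
```

Because `head` simply takes the first rows, cropping by a coordinate window first may give a more compact region if the file is not ordered spatially.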
Hi @EliHei2, Thanks for the quick response. I thought that could be the reason for the bug, so I tried adjusting the settings for the transcript neighbours and distances, but with no success. I wonder if this is something you can adjust in the sample.py file. I tried with the cosmx 1K lung public dataset as well as with 2 samples of my own (bone marrow and spleen). I can send you a portion of my own data, as it is based on newer versions of ATOMX (the public lung data is based on the first versions of ATOMX, so it has some small differences in the file formatting that can affect the debugging). The cosmx yaml file I produced is just a lightly edited copy of the xenium yaml file. Is there a way I can share the parquet and yaml files with you directly, as this data is not published yet? It would be great if we could stay in touch to work through this, because I am quite interested in seeing how segger can improve on the original segmentation (Cellpose) and the one I have done on this data with Baysor. Thanks!
Thanks @joaolsf! Please feel free to get in touch via email [email protected]! Looking forward to getting this going. Best.
Hi, @EliHei2,
I am testing segger on COSMX data. I am aware the tool has been tested with Xenium and Merscope data, but following the instructions that "Segger can also be extended to other platforms by modifying the column names or formats in the input files to match its expected structure, making it adaptable for various spatial transcriptomics technologies." I assume it should work on COSMX. Following the instructions, I modified the column names in the cosmx files accordingly and generated the transcript and boundaries parquet files. The lines of code for "sample = STSampleParquet" and "sample.save" ran just fine, with no errors raised. I used the same settings as in the example notebook. However, there are no tiles or PyTorch Geometric output files (.pt ?) in the segger_data_dir (which is created when sample.save runs). I tried different tile sizes, tile widths, and tile heights, and different parameters in sample.save; I even created a new yaml file in the same folder as the 'xenium' and 'xenium_v2' yaml files to accommodate the cosmx columns. Nothing worked. I don't understand what the issue could be, because no errors are flagged.
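A quick way to confirm whether any `.pt` files were written anywhere under the output directory is sketched below. It uses only the standard library and makes no assumption about how segger lays out its subfolders; the helper name `count_pt_files` is hypothetical.

```python
from pathlib import Path


def count_pt_files(data_dir):
    """Count .pt files per subdirectory under data_dir (searched recursively)."""
    counts = {}
    for pt_file in Path(data_dir).rglob("*.pt"):
        # Key by the folder path relative to data_dir ("." means data_dir itself).
        parent = str(pt_file.parent.relative_to(data_dir))
        counts[parent] = counts.get(parent, 0) + 1
    return counts
```

An empty dictionary means no `.pt` file exists at any depth, which rules out the files simply landing in an unexpected subfolder.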
I have run the exact same code on my machine with your xenium example dataset, and everything works fine, so I don't think there are any issues with the installed version of segger or with any other packages/dependencies. Could this be because the tiling settings are optimised for xenium and do not work at all with cosmx data?
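One way to sanity-check the tiling question is to compare the coordinate extent of the transcripts with the tile size being requested. The sketch below is a plain arithmetic check, not segger's tiling logic; `tile_grid_shape` is a hypothetical helper. It also assumes the coordinates and the tile sizes are in the same units, which is worth verifying for cosmx exports (for example pixels versus microns).

```python
import math


def tile_grid_shape(xs, ys, tile_width, tile_height):
    """Number of (columns, rows) of tiles needed to cover the coordinate extent."""
    x_extent = max(xs) - min(xs)
    y_extent = max(ys) - min(ys)
    # At least one tile per axis, even if all points share a coordinate.
    n_cols = max(1, math.ceil(x_extent / tile_width))
    n_rows = max(1, math.ceil(y_extent / tile_height))
    return n_cols, n_rows
```

If the resulting grid is far larger or smaller than the one obtained for the xenium example with the same settings, a unit or scale mismatch is a plausible place to look.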
Hope you can help me because we are very interested in getting segger to work on cosmx data as well.
Thanks!
Joao