Error when running on a subset of 10x Xenium Human Ovarian Cancer #12

@VPetukhov

I'm running ComSeg on a Xenium Human Ovarian Cancer dataset. Running on the full dataset (~120M molecules) takes forever, so as a test I made a subset of 290k molecules. Below is my code:

import comseg.dataset
from comseg import dictionary

dict_scale = {"x": 0.2125, "y": 0.2125, "z": 0.2125}

### create the dataset object
dataset = comseg.dataset.ComSegDataset(
    path_dataset_folder=path_dataset_folder,
    path_to_mask_prior=path_to_mask_prior,
    dict_scale=dict_scale,
    mask_file_extension=".tif",
    mean_cell_diameter=mean_cell_diameter,
    prior_name='in_nucleus'
)

dataset.add_prior_from_mask(overwrite=True)

dataset.compute_edge_weight( 
    images_subset=None,
    n_neighbors=40,
    sampling=True,
    sampling_size=10000
)

Comsegdict = dictionary.ComSegDict(
    dataset=dataset,
    mean_cell_diameter=mean_cell_diameter,
    community_detection="with_prior"
)

Comsegdict.compute_community_vector()

Comsegdict.compute_insitu_clustering(
    size_commu_min=3,
    norm_vector=True,
    ### parameter clustering
    n_pcs=3,
    n_comps=3,
    clustering_method="leiden",
    n_neighbors=20,
    resolution=1,
    n_clusters_kmeans=4,
    palette=None,
    nb_min_cluster=0,
    min_merge_correlation=0.8,
)

It works fine till the last function call (Comsegdict.compute_insitu_clustering), which fails with the following logs:

0%|                                                                                                                                                        | 0/1 [00:00<?, ?it/s]
improvement of modularity 0.04263385318290602
improvement of modularity 0.005007403551788303
improvement of modularity 0.0006468670214521133
improvement of modularity 2.5626005440537725e-05
/home/vpetukhov/seg-errors/segmentation/env_sopa/lib/python3.12/site-packages/comseg/model.py:352: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
  for node in self.community_anndata.obs["node_index"][comm_index]:
/home/vpetukhov/seg-errors/segmentation/env_sopa/lib/python3.12/site-packages/comseg/model.py:354: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
  nn_expression_vector = nn_expression_vector / len(self.community_anndata.obs["node_index"][comm_index])
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [03:30<00:00, 210.56s/it]
Writing temporary files...
Running scTransform via Rscript...

Attaching package: arrow

The following objects are masked from package:feather:

    read_feather, write_feather

The following object is masked from package:utils:

    timestamp

Calculating cell attributes from input UMI matrix: log_umi
Variance stabilizing transformation of count matrix of size 3784 by 18972
Model formula is y ~ log_umi
Get Negative Binomial regression parameters per gene
Using 2000 genes, 18972 cells
  |                                                                      |   0%Error in getGlobalsAndPackages(expr, envir = envir, globals = globals) : 
  The total size of the 19 globals exported for future expression (FUN()) is 1.21 GiB.. This exceeds the maximum allowed size of 500.00 MiB (option 'future.globals.maxSize'). The three largest globals are FUN (1.14 GiB of class function), umi_bin (72.60 MiB of class numeric) and data_step1 (526.63 KiB of class list)
 ... getGlobalsAndPackagesXApply -> getGlobalsAndPackages
Execution halted
Reading output files...
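
Side note on the two FutureWarning lines in the log: they come from pandas and appear unrelated to the R failure. A minimal sketch of what the warning refers to, using made-up data (the Series contents and the value of `comm_index` are illustrative assumptions, not taken from ComSeg internals): an integer key on a Series with string labels falls back to positional lookup, and pandas asks for that to be written explicitly with `.iloc`.

```python
import pandas as pd

# Illustrative data only: a Series with string labels, similar in shape
# to a column of an AnnData .obs table.
node_index = pd.Series(
    [[0, 1], [2, 3, 4], [5]],
    index=["0", "1", "2"],
    name="node_index",
)
comm_index = 1  # integer position, not a label

# node_index[comm_index] would rely on the deprecated positional fallback.
# Explicit positional access avoids the warning:
nodes_by_position = node_index.iloc[comm_index]

# Label-based access uses the string label instead:
nodes_by_label = node_index.loc["1"]

print(nodes_by_position)
print(nodes_by_label)
```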
