Commit 5331b9e

JGarnica22 and mumichae authored
Add CiLISI as new metric component (#57)
* add metric: CiLISI
* update change log
* fix: move clisi comment below the kBET note to avoid having 2 new functionality sections
* fix: remove boilerplate comments for better readability
* Update base_r container

Co-authored-by: Michaela Müller <[email protected]>

* fix: standardize R indenting

Co-authored-by: Michaela Müller <[email protected]>

---------

Co-authored-by: Michaela Müller <[email protected]>
1 parent 4e9ebb6 commit 5331b9e

3 files changed: +106 -0 lines changed
CHANGELOG.md

Lines changed: 6 additions & 0 deletions
@@ -7,6 +7,12 @@

 * Added `ARI_batch` and `NMI_batch` to `metrics/clustering_overlap` (PR #68).

+* Added `metrics/cilisi` new metric component (PR #57).
+  - ciLISI measures batch mixing in a cell type-aware manner by computing iLISI within each cell type and normalizing
+    the scores between 0 and 1. Unlike iLISI, ciLISI preserves sensitivity to biological variance and avoids favoring
+    overcorrected datasets with removed cell type signals.
+    We propose adding this metric to substitute iLISI.
+
 ## Minor changes

 * Un-pin the scPRINT version and update parameters (PR #51)
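
The changelog entry above says the scores are normalized between 0 and 1 within each cell type. A minimal sketch of what that could look like for a single cell follows. It rests on two assumptions that are not stated in the commit: neighbours are counted equally (real LISI weights them with a perplexity-based kernel), and the rescaling has the form `(LISI - 1) / (n_batches - 1)`. The helper names `inverse_simpson` and `normalize_lisi` are hypothetical and are not part of scIntegrationMetrics.

```r
# Toy illustration only: every neighbour counts equally here, whereas
# real LISI uses perplexity-based neighbourhood weights.

# Inverse Simpson's index of a vector of batch labels: 1 / sum(p_b^2).
inverse_simpson <- function(labels) {
  p <- table(labels) / length(labels)
  1 / sum(p^2)
}

# Assumed normalization: rescale from [1, n_batches] to [0, 1].
normalize_lisi <- function(lisi, n_batches) {
  (lisi - 1) / (n_batches - 1)
}

# Hypothetical neighbourhoods of one cell, inside a cell type with 2 batches.
mixed   <- c("b1", "b2", "b1", "b2")  # both batches equally represented
unmixed <- c("b1", "b1", "b1", "b1")  # a single batch only

mixed_score   <- normalize_lisi(inverse_simpson(mixed), n_batches = 2)
unmixed_score <- normalize_lisi(inverse_simpson(unmixed), n_batches = 2)

print(c(mixed = mixed_score, unmixed = unmixed_score))
```

Under these assumptions a perfectly mixed neighbourhood scores 1 and a single-batch neighbourhood scores 0, which matches the `min: 0`, `max: 1`, `maximize: true` settings declared for the metric.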

src/metrics/cilisi/config.vsh.yaml

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
+__merge__: ../../api/comp_metric.yaml
+name: cilisi
+info:
+  metrics:
+    - name: cilisi
+      label: CiLISI
+      summary: Cell-type aware version of iLISI (Local inverse Simpson's Index).
+        iLISI is computed separately for each cell type or cluster, normalized between 0 and 1, and averaged across all cells (global mean).
+        By default, CiLISI is calculated only for groups with at least 10 cells and 2 distinct batch labels (configurable).
+      description: |
+        ciLISI measures batch mixing in a cell type-aware manner by computing iLISI within each cell type and normalizing
+        the scores between 0 and 1. Unlike iLISI, ciLISI preserves sensitivity to biological variance and avoids favoring
+        overcorrected datasets with removed cell type signals.
+      references:
+        doi: 10.1038/s41467-024-45240-z
+      links:
+        documentation: https://github.com/carmonalab/scIntegrationMetrics
+        repository: https://github.com/carmonalab/scIntegrationMetrics
+      min: 0
+      max: 1
+      maximize: true
+
+    - name: cilisi_means
+      label: CiLISI_means
+      summary: As CiLISI, but returns the mean of per-group CiLISI values (i.e., average of the means per group) instead of a global average.
+      description: |
+        ciLISI measures batch mixing in a cell type-aware manner by computing iLISI within each cell type and normalizing
+        the scores between 0 and 1. Unlike iLISI, ciLISI preserves sensitivity to biological variance and avoids favoring
+        overcorrected datasets with removed cell type signals.
+      references:
+        doi: 10.1038/s41467-024-45240-z
+      links:
+        documentation: https://github.com/carmonalab/scIntegrationMetrics
+        repository: https://github.com/carmonalab/scIntegrationMetrics
+      min: 0
+      max: 1
+      maximize: true
+resources:
+  - type: r_script
+    path: script.R
+engines:
+  - type: docker
+    image: openproblems/base_r:1
+    setup:
+      - type: r
+        github: https://github.com/carmonalab/[email protected]
+runners:
+  - type: executable
+  - type: nextflow
+    directives:
+      label: [midtime,midmem,midcpu]
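
The summary above states that CiLISI is only calculated for groups with at least 10 cells and 2 distinct batch labels. The rule can be sketched in base R as below. This only illustrates the rule as worded in the config and is not the package's own implementation; the metadata and the variable names `min_cells` and `min_labels` are made up for the example.

```r
# Hypothetical metadata: one row per cell.
meta <- data.frame(
  cell_type = c(rep("T", 12), rep("B", 12), rep("NK", 4)),
  batch     = c(rep(c("b1", "b2"), 6), rep("b1", 12), rep(c("b1", "b2"), 2))
)

min_cells  <- 10  # "at least 10 cells"
min_labels <- 2   # "2 distinct batch labels"

# Split cells by cell type and check both thresholds for each group.
groups <- split(meta, meta$cell_type)
eligible <- vapply(
  groups,
  function(g) nrow(g) >= min_cells && length(unique(g$batch)) >= min_labels,
  logical(1)
)

print(eligible)
```

In this toy data only `T` passes: `B` has enough cells but a single batch, and `NK` has both batches but only 4 cells.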

src/metrics/cilisi/script.R

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+library(anndata)
+library(scIntegrationMetrics)
+
+## VIASH START
+par <- list(
+  input_integrated = "resources_test/task_batch_integration/cxg_immune_cell_atlas/integrated_processed.h5ad",
+  input_solution = "resources_test/task_batch_integration/cxg_immune_cell_atlas/solution.h5ad",
+  output = "output.h5ad"
+)
+meta <- list(
+  name = "cilisi"
+)
+## VIASH END
+
+cat("Reading input files\n")
+adata <- anndata::read_h5ad(par[["input_integrated"]])
+solution <- anndata::read_h5ad(par[["input_solution"]])
+embeddings <- adata$obsm[["X_emb"]]
+metadata <- solution$obs
+
+cat("Compute CiLISI metrics...\n")
+lisisplit <-
+  scIntegrationMetrics::compute_lisi_splitBy(
+    X = embeddings,
+    meta_data = metadata,
+    label_colnames = "batch",
+    perplexity = 30,
+    split_by_colname = "cell_type",
+    normalize = TRUE,
+    min.cells.split = 10,
+    min.vars.label = 2
+  )
+# average CiLISI
+cilisi <- mean(unlist(lisisplit))
+# Mean per cell type
+cilisi_means <- mean(sapply(lisisplit, function(x) mean(x[, 1])))
+
+cat("Write output AnnData to file\n")
+output <- anndata::AnnData(
+  shape = c(1,2),
+  uns = list(
+    dataset_id = adata$uns[["dataset_id"]],
+    normalization_id = adata$uns[["normalization_id"]],
+    method_id = adata$uns[["method_id"]],
+    metric_ids = c("cilisi", "cilisi_means"),
+    metric_values = list(cilisi, cilisi_means)
+  )
+)
+output$write_h5ad(par[["output"]], compression = "gzip")
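
The two aggregation lines in the script can give different values when cell types differ in size. The sketch below reuses those two expressions on made-up scores. It assumes the per-group result is a list of one-column data frames, which is inferred from the `x[, 1]` indexing in the script and not confirmed against the package documentation; the group names and values are hypothetical.

```r
# Hypothetical per-cell-type scores, shaped like a list of one-column data frames.
lisisplit <- list(
  T_cell = data.frame(batch = c(0.9, 0.9, 0.9, 0.9)),  # large, well-mixed group
  B_cell = data.frame(batch = c(0.1))                  # small, poorly mixed group
)

# Global mean over all cells: larger groups contribute more values.
cilisi <- mean(unlist(lisisplit))

# Mean of per-group means: each cell type counts once.
cilisi_means <- mean(sapply(lisisplit, function(x) mean(x[, 1])))

print(c(cilisi = cilisi, cilisi_means = cilisi_means))
```

With these numbers `cilisi` is 0.74 and `cilisi_means` is 0.5, so the per-group variant is more sensitive to a rare cell type that mixes poorly.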
