Enhancing flexibility, removing heuristics and changing knn construction #1
Subsample based on the percent of cells
I added support for the user to specify the percent of test cells rather than the absolute number to downsample to.
Split test based on percent of cells
I added support for specifying the percent of test cells rather than the absolute number. This avoids failures when a donor has fewer cells than the requested absolute number but still has enough cells to split by percent.
For reproducibility, I also added support for specifying the seed.
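A minimal sketch of the idea, not the code in this PR: the function name `split_by_percent`, its parameters, and the donor data are all hypothetical and only illustrate how a percent-based split with a fixed seed behaves for donors of very different sizes.

```python
import random


def split_by_percent(cell_ids, test_percent, seed=None):
    """Split cell ids into (train, test), holding out `test_percent` percent.

    Hypothetical helper for illustration; the names are assumptions.
    """
    if not 0 < test_percent < 100:
        raise ValueError("test_percent must be between 0 and 100 (exclusive)")
    rng = random.Random(seed)  # a fixed seed makes the split reproducible
    shuffled = list(cell_ids)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_percent / 100)
    return shuffled[n_test:], shuffled[:n_test]


# Two made-up donors: an absolute test size of, say, 100 cells would fail
# for donor_b, while a percent-based split still works for both.
donors = {
    "donor_a": [f"a{i}" for i in range(1000)],
    "donor_b": [f"b{i}" for i in range(40)],
}
splits = {
    donor: split_by_percent(cells, test_percent=20, seed=0)
    for donor, cells in donors.items()
}
for donor, (train, test) in splits.items():
    print(donor, len(train), len(test))
```

Calling the function twice with the same seed returns the same split, which is the reproducibility property the seed option is meant to give.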
Computing modularity safely
A frequent error was computing modularity for a k outside the range of the merges in the walktrap object. I added a safe way of automatically computing modularity for every k and a column with the differences in modularity to help in a posteriori k selection.
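The approach can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: it uses plain Python rather than the walktrap object itself, assumes a 0-based merge convention in which merge `i` creates cluster id `n + i`, and the helper names (`membership_after`, `modularity`, `modularity_table`) are hypothetical. The key point is that the valid k values are derived from the number of available merges, so an out-of-range k is never requested.

```python
def membership_after(n, merges, steps):
    """Apply the first `steps` merges and return one cluster label per vertex."""
    members = {v: [v] for v in range(n)}
    for i, (a, b) in enumerate(merges[:steps]):
        members[n + i] = members.pop(a) + members.pop(b)
    labels = [None] * n
    for cluster_id, vertices in members.items():
        for v in vertices:
            labels[v] = cluster_id
    return labels


def modularity(edges, labels):
    """Newman modularity of an unweighted, undirected graph."""
    m = len(edges)
    intra, degree = {}, {}
    for u, v in edges:
        degree[labels[u]] = degree.get(labels[u], 0) + 1
        degree[labels[v]] = degree.get(labels[v], 0) + 1
        if labels[u] == labels[v]:
            intra[labels[u]] = intra.get(labels[u], 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree.items())


def modularity_table(n, edges, merges):
    """Modularity for every k reachable from `merges`, smallest k first.

    `diff` is the change in modularity relative to the previous (smaller) k.
    """
    rows = []
    for steps in range(len(merges), -1, -1):
        q = modularity(edges, membership_after(n, merges, steps))
        diff = None if not rows else q - rows[-1]["modularity"]
        rows.append({"k": n - steps, "modularity": q, "diff": diff})
    return rows


# Toy graph: two triangles joined by one edge, with a hand-written merge order.
n = 6
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
merges = [(0, 1), (6, 2), (3, 4), (8, 5), (7, 9)]
for row in modularity_table(n, edges, merges):
    print(row)
```

If the merge list is shorter than `n - 1` rows, the table simply starts at a larger k instead of raising an error.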
From correlation-based to Euclidean distance
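As a small illustration of the triangle-inequality point in this section, the sketch below compares the two distances on three hand-picked vectors. The vectors and function names are made up for the example and are not taken from the package.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_distance(x, y):
    return 1 - pearson(x, y)


def euclidean_distance(x, y):
    return math.dist(x, y)


# a and c are uncorrelated; b sits "between" them (b = a + c).
a = [1, -1, 1, -1]
c = [1, 1, -1, -1]
b = [2, 0, 0, -2]

for name, dist in [("1 - correlation", correlation_distance),
                   ("Euclidean", euclidean_distance)]:
    direct = dist(a, c)
    via_b = dist(a, b) + dist(b, c)
    print(f"{name}: d(a,c)={direct:.3f}, d(a,b)+d(b,c)={via_b:.3f}, "
          f"triangle inequality holds: {direct <= via_b}")
```

For these vectors the direct correlation distance from `a` to `c` is larger than the detour through `b`, while the Euclidean distance respects the inequality.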
For some matrices, the correlation distance had problems ensuring all the entries were positive. I changed the distance to Euclidean. Unlike 1 - correlation, it does not assume linearity or normality and does not violate the triangle inequality.
Enhanced user control
Training many scHPF models is time-consuming. I added support so that the user can specify 1) whether to overwrite existing models and 2) an a priori k. This avoids retraining models that are already available.
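The two options could behave roughly like this. Everything in the sketch is hypothetical (the names `resolve_k` and `fit_models`, the file naming, and the `dummy_train` stand-in for actual scHPF training); it only shows the skip-if-present logic and how an a priori k replaces a grid of candidate values.

```python
import json
import tempfile
from pathlib import Path


def resolve_k(k_apriori, k_grid):
    """Use the a priori k if the user gave one, otherwise the full grid."""
    return [k_apriori] if k_apriori is not None else list(k_grid)


def fit_models(out_dir, k_values, train_fn, overwrite=False):
    """Train one model per k, reusing saved models unless `overwrite` is set.

    Returns (trained_ks, reused_ks).
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    trained, reused = [], []
    for k in k_values:
        path = out_dir / f"model_k{k}.json"
        if path.exists() and not overwrite:
            reused.append(k)  # model already on disk: skip the slow step
            continue
        path.write_text(json.dumps(train_fn(k)))
        trained.append(k)
    return trained, reused


def dummy_train(k):
    """Stand-in for the expensive training step (assumption, not scHPF)."""
    return {"k": k}


with tempfile.TemporaryDirectory() as tmp:
    print(fit_models(tmp, resolve_k(None, [5, 6]), dummy_train))  # trains both
    print(fit_models(tmp, resolve_k(None, [5, 6]), dummy_train))  # reuses both
    print(fit_models(tmp, resolve_k(5, [5, 6]), dummy_train, overwrite=True))
```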
Miscellaneous