oNMF call uses max number of cores #12

Open
wmacnair opened this issue Nov 26, 2020 · 2 comments

Comments

@wmacnair

Hi

I'm trying to use popalign on a cluster, and finding that I can't stop it from using all available computing power. This is not making me any friends...!

Once I get to the PA.onmf call, every core on the cluster jumps to 100%. This is despite manually setting pop['ncores'] = 8. I've also tried multiple other things without success (see below, where I have thrown in everything I've been able to find)...

Thanks for any help,
Will

import os
os.environ["OMP_NUM_THREADS"]           = "8" # shell equivalent: export OMP_NUM_THREADS=8
os.environ["OPENBLAS_NUM_THREADS"]      = "8" # shell equivalent: export OPENBLAS_NUM_THREADS=8
os.environ["MKL_NUM_THREADS"]           = "8" # shell equivalent: export MKL_NUM_THREADS=8
os.environ["VECLIB_MAXIMUM_THREADS"]    = "8" # shell equivalent: export VECLIB_MAXIMUM_THREADS=8
os.environ["NUMEXPR_NUM_THREADS"]       = "8" # shell equivalent: export NUMEXPR_NUM_THREADS=8

import popalign as PA
import pickle
import numpy as np
from scipy import io as sio
from multiprocessing import Pool
import torch

torch.set_num_threads(8)

def main(types_list):
    pool        = Pool(processes=8)
    
    save_dir    = 'output/ms27_popalign'
    genes_f     = os.path.join(save_dir, 'oligo_features.tsv')
    samples     = {t: os.path.join(save_dir, t + '.mtx') for t in types_list}

    print('loading samples')
    pop = PA.load_samples(samples=samples, genes=genes_f, outputfolder=save_dir)
    # pop = load_samples_hack(samples=samples, genes=genes_f, outputfolder=save_dir)
    pop['ncores'] = 8
    print('normalizing')
    PA.normalize(pop)
    print('identifying HVGs')
    PA.plot_gene_filter(pop, offset=1.3)
    PA.filter(pop, remove_ribsomal=False, remove_mitochondrial=False)
    print('running oNMF')
    PA.onmf(pop, ncells=5000, nfeats=np.arange(2,20,2).tolist(), nreps=3, niter=200)
    print('running GSEA')
    PA.choose_featureset(pop, alpha = 3, multiplier=3)
    # PA.gsea(pop, geneset='c5bp')

    # save
    print('saving')
    pop_f       = os.path.join(save_dir, 'popalign_obj.p')
    # use a context manager so the file handle is closed after writing
    with open(pop_f, "wb") as f:
        pickle.dump(pop, f)
@sisichen-dev
Collaborator

sisichen-dev commented Nov 26, 2020 via email

@wmacnair
Author

Hey Sisi

Thanks for getting back so quickly. In the end I tried it out on a desktop and it ran fine in a few minutes, so there's no real urgency to fix this, at least from my side. I'm also not familiar with Python multiprocessing, so I was hoping to learn something ;)

Enjoy Thanksgiving!
Will
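
For anyone who hits the same problem on a shared machine: one OS-level workaround that does not depend on popalign's internals is to restrict the CPU affinity of the Python process before the heavy call. The sketch below is Linux-only and uses only the standard library; `limit_cores` is a hypothetical helper name, not part of popalign, and whether this is enough for `PA.onmf` specifically has not been confirmed in this thread. Threads and child processes created after the call inherit the restriction, so it is best called at the very top of the script, before importing numpy, torch or popalign.

```python
import os

def limit_cores(n):
    """Confine this process to at most n of the cores it may currently use.

    Hypothetical helper (not a popalign function). Linux-only, because
    os.sched_getaffinity / os.sched_setaffinity are not available elsewhere.
    """
    allowed = sorted(os.sched_getaffinity(0))  # cores currently available to us
    chosen = set(allowed[:n])                  # keep the first n of them
    os.sched_setaffinity(0, chosen)            # 0 refers to the calling process
    return chosen

if __name__ == "__main__":
    cores = limit_cores(8)
    print("restricted to cores:", sorted(cores))
```

Note that this only confines work to the chosen cores; it does not reduce how many worker processes or threads a library decides to start, so a library that launches one worker per core on the machine may run slower than with a matching worker count.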
