Parallel implementation of gene regulatory network inference using an information-theoretic approach with Boolean networks. The algorithm is adapted from Barman & Kwon (https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0171097) and uses mutual information-based feature selection (MIFS).
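As a rough illustration of the MIFS idea, the sketch below scores each candidate regulator by its mutual information with the target's next state, minus a penalty for redundancy with regulators already chosen. It is a minimal sketch and not the GRNPar implementation: the function names, the `beta` weight, the choice to allow a gene to regulate itself, and the toy data are all assumptions, and the procedure in the paper may include refinement steps not shown here.

```python
from collections import Counter
from math import log2


def entropy(symbols):
    """Shannon entropy, in bits, of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def select_inputs(series, target, k, beta=1.0):
    """Greedily pick up to k candidate regulators for `target`.

    series maps gene name -> list of 0/1 states, one per time step.
    A regulator's state at time t is compared with the target's state at t+1.
    """
    future = series[target][1:]
    past = {gene: states[:-1] for gene, states in series.items()}
    selected = []
    while len(selected) < k and len(selected) < len(past):
        def score(gene):
            # Relevance to the target minus redundancy with earlier picks.
            relevance = mutual_information(past[gene], future)
            redundancy = sum(
                mutual_information(past[gene], past[s]) for s in selected
            )
            return relevance - beta * redundancy

        remaining = [g for g in past if g not in selected]
        selected.append(max(remaining, key=score))
    return selected


# Toy data (hypothetical): c at time t+1 copies a at time t.
series = {
    "a": [0, 1, 1, 0, 1, 0, 0, 1],
    "b": [1, 1, 0, 0, 1, 1, 0, 0],
    "c": [0, 0, 1, 1, 0, 1, 0, 0],
}
print(select_inputs(series, "c", k=1))  # prints ['a']
```

Because each target gene is scored independently of the others, this kind of selection can be run for many targets at once, which is what makes the problem a natural fit for parallel execution.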
To generate output images, Graphviz must be installed on the local machine:
brew install graphviz
OR
sudo apt install graphviz
Running dot -V should output something like:
dot - graphviz version 7.0.4 (20221203.1631)
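For orientation, a network image of this kind can be produced by writing the inferred edges as Graphviz DOT source and rendering it with dot. The sketch below is an assumption about the general approach, not a description of how GRNPar calls Graphviz; the `to_dot` function and the three-gene network are hypothetical.

```python
def to_dot(regulators):
    """Return Graphviz DOT source with one edge per (regulator -> target) pair.

    regulators maps each target gene to the list of genes selected as its inputs.
    """
    lines = ["digraph GRN {"]
    for target, inputs in regulators.items():
        for source in inputs:
            lines.append(f'  "{source}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical three-gene network, for illustration only.
dot_source = to_dot({"c": ["a", "b"], "a": ["c"]})
print(dot_source)
```

Saving the printed text as network.dot and running dot -Tpng network.dot -o network.png renders it as a PNG.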
From the GRNPar directory, run:
stack setup
stack build
To run the executable:
stack exec GRNPar-exe <csvFilename> <k> <genExpressions> <genImage> <mode>
- csvFilename: gene-expression time-series data
- k: fixed number of input nodes for each target node
- genExpressions: whether to generate Boolean expressions for each node (1 = True, 0 = False)
- genImage: whether to generate an image of the Boolean network (1 = True, 0 = False)
  - If 1, the output file is csvFilename with a png extension
- mode: "seq" or "par" (seq for the sequential algorithm, par for the parallel implementation)
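To make the genExpressions option more concrete, the sketch below shows one way a Boolean expression could be derived for a node once its k inputs are known: tally what the target does at time t+1 for each input combination seen at time t, then print the combinations that lead to 1. This is an assumption for illustration only; the function names, the majority-vote rule, the tie-break, and the `&`, `|`, `!` output syntax are hypothetical and may differ from what GRNPar emits.

```python
from collections import Counter, defaultdict
from itertools import product


def infer_truth_table(inputs, target):
    """Map each observed input combination at time t to the target's
    majority state at time t+1.

    inputs: list of equal-length 0/1 lists, one per regulator.
    target: 0/1 list of the same length.
    """
    votes = defaultdict(Counter)
    for t in range(len(target) - 1):
        combo = tuple(states[t] for states in inputs)
        votes[combo][target[t + 1]] += 1
    # Ties go to 0; that tie-break is an arbitrary choice for this sketch.
    return {combo: int(c[1] > c[0]) for combo, c in votes.items()}


def to_expression(names, table):
    """Render the combinations that map to 1 as an OR of AND terms.

    Combinations never observed in the data are treated as 0.
    """
    terms = []
    for combo in product((0, 1), repeat=len(names)):
        if table.get(combo) == 1:
            literals = [n if bit else f"!{n}" for n, bit in zip(names, combo)]
            terms.append("(" + " & ".join(literals) + ")")
    return " | ".join(terms) if terms else "False"


# Toy data (hypothetical): c at time t+1 equals (a AND b) at time t.
a = [0, 0, 1, 1, 0]
b = [0, 1, 0, 1, 0]
c = [0, 0, 0, 0, 1]
table = infer_truth_table([a, b], c)
print(to_expression(["a", "b"], table))  # prints (a & b)
```

A truth table over k inputs has 2^k rows, so a small fixed k keeps both the table and the printed expression manageable.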
To run the parallel implementation on 4 cores or the sequential implementation, generating only Boolean expressions (no image):
stack exec GRNPar-exe "src/data/nodes_100_time_300.csv" 5 1 0 par -- +RTS -ls -N4
OR
stack exec GRNPar-exe "src/data/nodes_100_time_300.csv" 5 1 0 seq -- +RTS -ls
threadscope GRNPar-exe.eventlog
To generate both Boolean expressions and an image:
stack exec GRNPar-exe "src/data/nodes_500_time_300.csv" 5 1 1 par -- +RTS -ls -N4
OR
stack exec GRNPar-exe "src/data/nodes_500_time_300.csv" 5 1 1 seq -- +RTS -ls
threadscope GRNPar-exe.eventlog
To run on the E. coli dataset on 8 cores with k = 5, generating both expressions and an image:
stack exec GRNPar-exe "src/data/e_coli.csv" 5 1 1 par -- +RTS -ls -N8
threadscope GRNPar-exe.eventlog
Generating synthetic data requires Python and pandas.
From the GRNPar directory, run:
python src/generate_data.py --numNodes 100 --time 300 --outputFile "src/data/nodes_100_time_300.csv"
- Creates a random gene expression time-series consisting of 100 genes (nodes) with 300 timesteps.
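For readers who want a feel for what such a dataset might contain, the sketch below simulates a random Boolean network and records its state at every time step. It is not src/generate_data.py, and it makes no claim about how that script builds its data or lays out its CSV; the function name, the `k` and `seed` parameters, and the one-row-per-time-step layout are all assumptions. It uses only the standard library rather than pandas.

```python
import random
from itertools import product


def simulate_random_network(num_nodes, timesteps, k=2, seed=None):
    """Return `timesteps` rows, each holding one 0/1 state per gene."""
    rng = random.Random(seed)
    # Each gene gets k distinct regulators and a random truth table over them.
    regulators = [rng.sample(range(num_nodes), k) for _ in range(num_nodes)]
    tables = [
        {combo: rng.randint(0, 1) for combo in product((0, 1), repeat=k)}
        for _ in range(num_nodes)
    ]
    state = [rng.randint(0, 1) for _ in range(num_nodes)]
    rows = [state]
    for _ in range(timesteps - 1):
        # Synchronous update: every gene reads the previous time step.
        state = [
            tables[g][tuple(state[r] for r in regulators[g])]
            for g in range(num_nodes)
        ]
        rows.append(state)
    return rows


rows = simulate_random_network(num_nodes=100, timesteps=300, seed=0)
print(len(rows), len(rows[0]))  # prints 300 100
```

Passing a fixed seed makes the series reproducible, which helps when comparing sequential and parallel runs on the same input.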