SMOC

Smart Model for Open chromatin region in Rice

Introduction

Chromatin accessibility is one of the most important structural features of chromatin. It determines the degree to which nuclear macromolecules can access chromosomal DNA, which is crucial to the regulation of gene transcription. Here, we designed and developed SMOC, a rice-specific tool based on a deep learning algorithm with a multi-layer CNN architecture, to predict open chromatin regions (OCRs) in two rice cultivars, NIP and 93-11, under normal and heat stress conditions.

System requirements

Python 2.7 or Python 3.5
tensorflow 2.0.0
keras 2.3.1
theano 1.0.4

Quick Start to install the required program

Install Python 2.7 or 3.5 from Anaconda https://www.anaconda.com/
pip install tensorflow==2.0.0 (python=2.7) or pip install tensorflow==2.0.0-alpha0 (python=3.5)
pip install keras==2.3.1
pip install theano==1.0.4
git clone https://github.com/BRITian/SMOC

Training models

The program smoc_train_models_py2.7.py or smoc_train_models_py3.5.py is used to train the prediction model in a Python 2.7 or 3.5 environment, respectively. Five parameters must be provided in the following order: training filename, test filename, model name, sequence length, and the number of classes in the model. In the coding file, the first column is the label of the sequence, that is, its modified (OCR) or unmodified (non-OCR) state. The rest of the line is the coding data, in which each nucleotide is encoded as a number: A as 0, T as 1, C as 2 and G as 3. Different epigenetic modifications use different training parameters. The following examples construct the prediction models in a Python 2.7 environment.

The OCR prediction models
python smoc_train_models_py2.7.py example_Train_file_NIPCK example_Test_file_NIPCK model.h5 147 2
python smoc_train_models_py2.7.py example_Train_file_NIPHS example_Test_file_NIPHS model.h5 147 2
python smoc_train_models_py2.7.py example_Train_file_9311CK example_Test_file_9311CK model.h5 147 2
python smoc_train_models_py2.7.py example_Train_file_9311HS example_Test_file_9311HS model.h5 147 2

The model file will be constructed in the current directory.
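
The nucleotide encoding described in the Training models section can be sketched in Python as follows. This is only an illustration of the stated mapping (A as 0, T as 1, C as 2, G as 3, with the label in the first column). The function names are hypothetical, and the comma delimiter and the handling of symbols such as N are assumptions, so check the example files in the repository for the actual layout.

```python
# Mapping stated in this README: A -> 0, T -> 1, C -> 2, G -> 3.
NUCLEOTIDE_CODES = {"A": 0, "T": 1, "C": 2, "G": 3}


def encode_sequence(sequence):
    """Return the integer codes for a DNA sequence (case-insensitive)."""
    codes = []
    for base in sequence.upper():
        if base not in NUCLEOTIDE_CODES:
            # How the real scripts treat N or other symbols is not documented
            # here, so this sketch simply rejects them.
            raise ValueError("unsupported nucleotide: %r" % base)
        codes.append(NUCLEOTIDE_CODES[base])
    return codes


def make_coding_line(label, sequence, delimiter=","):
    """Build one coding-file line: the label first, then one code per base.

    The delimiter is an assumption; the real coding files may differ.
    """
    fields = [str(label)] + [str(code) for code in encode_sequence(sequence)]
    return delimiter.join(fields)


if __name__ == "__main__":
    # A short toy sequence; real fragments would have the trained length (147).
    print(make_coding_line(1, "ATCGGA"))
```

A sequence prepared for the example commands above would need exactly 147 nucleotides, matching the sequence length parameter passed to the training script.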

Predicting the modifications in the sequence

The main program smoc_prediction_py2.7.pl (or smoc_prediction_py3.5.pl in a Python 3.5 environment) can be used to predict OCRs in a sequence. Three parameters (-I, -T and -O) must be provided.

perl smoc_prediction_py2.7.pl -I input_fasta_sequence -T model_type -O output_file

-I, The input sequence in FASTA format
-T, The model type. There are four pre-constructed models (NIPCK, NIPHS, 9311CK or 9311HS)
-O, The output file
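
When many FASTA files need to be processed, the call can be wrapped in a small Python helper. The sketch below only assembles the documented -I, -T and -O options. The function names are hypothetical, and the script name and the list of accepted -T values are assumptions based on the file names in the repository, not confirmed against the Perl script itself.

```python
import subprocess

# Assumed model names, inferred from the Best_model_*.h5 files in the
# repository and the -T value used in the README example.
KNOWN_MODELS = ("NIPCK", "NIPHS", "9311CK", "9311HS")


def build_prediction_command(fasta_path, model, output_path,
                             script="smoc_prediction_py2.7.pl"):
    """Return the argument list for one prediction run."""
    if model not in KNOWN_MODELS:
        raise ValueError("unknown model: %s" % model)
    return ["perl", script, "-I", fasta_path, "-T", model, "-O", output_path]


def run_prediction(fasta_path, model, output_path):
    """Run the Perl script and raise if it exits with a non-zero status."""
    command = build_prediction_command(fasta_path, model, output_path)
    subprocess.check_call(command)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.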

The following is a command example.

perl smoc_prediction_py2.7.pl -I test.fasta -O test_results.out -T NIPCK

The predicted results are saved in the output file. In this file, the first column is the fragment number. The second and third columns are the sequence ID and the location of the first nucleotide of the fragment. The fourth and fifth columns are the predicted flag and its probability. The sixth column is the sequence of the fragment. For OCRs, the flags 0 and 1 represent non-modification and modification, respectively.
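
The six-column output described above can be read with a few lines of Python. This is a minimal sketch that assumes tab-separated columns and integer fragment numbers and locations, neither of which is stated here. The names Prediction, parse_prediction_line and predicted_ocrs, and the min_probability cutoff, are hypothetical.

```python
from collections import namedtuple

# One record per output line, following the documented column order.
Prediction = namedtuple(
    "Prediction",
    ["fragment", "seq_id", "start", "flag", "probability", "sequence"],
)


def parse_prediction_line(line, delimiter="\t"):
    """Split one output line into the six documented columns.

    The tab delimiter is an assumption about the output format.
    """
    fields = line.rstrip("\n").split(delimiter)
    if len(fields) != 6:
        raise ValueError("expected 6 columns, got %d" % len(fields))
    return Prediction(
        fragment=int(fields[0]),
        seq_id=fields[1],
        start=int(fields[2]),
        flag=int(fields[3]),
        probability=float(fields[4]),
        sequence=fields[5],
    )


def predicted_ocrs(lines, min_probability=0.5):
    """Keep fragments flagged 1 at or above a probability cutoff."""
    records = (parse_prediction_line(line) for line in lines if line.strip())
    return [r for r in records
            if r.flag == 1 and r.probability >= min_probability]
```

For example, `predicted_ocrs(open("test_results.out"))` would return the fragments flagged as 1, provided the file really uses the assumed layout.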
