SMEP

Smart Model for Epigenetics in Plants

Introduction

A Smart Model for Epigenetics in Plants (SMEP) was constructed to identify various epigenomic sites using a deep neural network (DNN) approach. SMEP predicts sites of DNA 5-methylcytosine (5mC) and N6-methyladenine (6mA) methylation, RNA N6-methyladenosine (m6A) methylation, histone H3 lysine-4 trimethylation (H3K4me3), H3 lysine-27 trimethylation (H3K27me3), and H3 lysine-9 acetylation (H3K9ac). You can use this code to train your own prediction models, or use the pre-constructed models to predict modifications of an input sequence in rice.

System requirements

Python 2.7 or Python 3.5
tensorflow 2.0.0
keras 2.3.1
theano 1.0.4

Quick start: installing the required programs

Install Python 2.7 or 3.5 from Anaconda (https://www.anaconda.com/)
pip install tensorflow==2.0.0 (python=2.7) or pip install tensorflow==2.0.0-alpha0 (python=3.5)
pip install keras==2.3.1
pip install theano==1.0.4
git clone https://github.com/BRITian/smep
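
To confirm that an environment matches the pinned versions listed above, a small check along the following lines may help. It is only a sketch: `REQUIRED`, `installed_version` and `check_requirements` are hypothetical helper names that are not part of SMEP, and the check assumes each package exposes a `__version__` attribute.

```python
import importlib

# Versions pinned in the requirements list of this README.
REQUIRED = {"tensorflow": "2.0.0", "keras": "2.3.1", "theano": "1.0.4"}


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def check_requirements(required=REQUIRED, lookup=installed_version):
    """Return {package: (wanted, found)} for every package that does not match."""
    problems = {}
    for name, wanted in required.items():
        found = lookup(name)
        # A prefix match accepts pre-release tags such as 2.0.0-alpha0.
        if found is None or not found.startswith(wanted):
            problems[name] = (wanted, found)
    return problems


if __name__ == "__main__":
    for name, (wanted, found) in sorted(check_requirements().items()):
        print("{}: wanted {}, found {}".format(name, wanted, found))
```

An empty result means all three packages report the expected version prefix.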

Training models

The program smep_train_models_py2.7.py or smep_train_models_py3.5.py is used to train a prediction model in a Python 2.7 or 3.5 environment, respectively. Five parameters must be provided in the following order: training filename, test filename, output model filename, sequence length, and number of classes in the model. In the coding files, the first column of each line is the label of the sequence, that is, its modified or unmodified state. The rest of the line is the coding data, in which each nucleotide is encoded as a number: A is 0, T is 1, C is 2, and G is 3. Different epigenetic modifications use different training parameters. The following examples construct the prediction models in a Python 2.7 environment.

The 5mC predicting model
python smep_train_models_py2.7.py example_Train_file_5mC example_Test_file_5mC model.h5 41 4
The 6mA predicting model
python smep_train_models_py2.7.py example_Train_file_6mA example_Test_file_6mA model.h5 41 2
The m6A predicting model
python smep_train_models_py2.7.py example_Train_file_m6A example_Test_file_m6A model.h5 800 2
The histone H3 lysine-4 trimethylation (H3K4me3) predicting model
python smep_train_models_py2.7.py example_Train_file_H3K4me3 example_Test_file_H3K4me3 model.h5 800 2
The histone H3 lysine-27 trimethylation (H3K27me3) predicting model
python smep_train_models_py2.7.py example_Train_file_H3K27me3 example_Test_file_H3K27me3 model.h5 800 2
The histone H3 lysine-9 acetylation (H3K9ac) predicting model
python smep_train_models_py2.7.py example_Train_file_H3K9ac example_Test_file_H3K9ac model.h5 800 2
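
The per-modification parameters in the commands above can be collected in one table so that the commands are generated rather than typed. This is a sketch only: `TRAINING_SETTINGS` and `build_training_command` are hypothetical names, and the example file naming pattern is copied from the commands above.

```python
# (sequence length, number of classes), as used in the example commands.
TRAINING_SETTINGS = {
    "5mC": (41, 4),
    "6mA": (41, 2),
    "m6A": (800, 2),
    "H3K4me3": (800, 2),
    "H3K27me3": (800, 2),
    "H3K9ac": (800, 2),
}


def build_training_command(modification, model_file="model.h5",
                           script="smep_train_models_py2.7.py"):
    """Return the argument list for training one modification model."""
    if modification not in TRAINING_SETTINGS:
        raise ValueError("unknown modification: {!r}".format(modification))
    seq_len, n_classes = TRAINING_SETTINGS[modification]
    return [
        "python", script,
        "example_Train_file_" + modification,
        "example_Test_file_" + modification,
        model_file, str(seq_len), str(n_classes),
    ]


if __name__ == "__main__":
    # Print one command per modification, each with its own model file name.
    for name in sorted(TRAINING_SETTINGS):
        print(" ".join(build_training_command(name, "model_" + name + ".h5")))
```

The returned list can be passed to `subprocess.call`. Giving each modification its own model file name is a suggestion here, since every example command above writes to model.h5.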

The model file will be written to the current directory.
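
The coding format described at the start of this section (label first, then one number per nucleotide) can be sketched as follows. The field delimiter is not stated in this README, so the `delimiter` parameter and its tab default are assumptions; check the example_Train_file_* files for the real layout. The function names are hypothetical, and the sketch rejects any character other than A, T, C and G because the handling of other symbols is not documented.

```python
# Nucleotide codes as given in this README.
NUCLEOTIDE_CODES = {"A": 0, "T": 1, "C": 2, "G": 3}


def encode_sequence(sequence):
    """Translate a nucleotide string into a list of integer codes."""
    codes = []
    for base in sequence.upper():
        if base not in NUCLEOTIDE_CODES:
            raise ValueError("unsupported nucleotide: {!r}".format(base))
        codes.append(NUCLEOTIDE_CODES[base])
    return codes


def format_training_line(label, sequence, delimiter="\t"):
    """Build one line of a coding file: the label, then the encoded bases.

    The delimiter is an assumption, not something this README specifies.
    """
    fields = [str(label)] + [str(code) for code in encode_sequence(sequence)]
    return delimiter.join(fields)


if __name__ == "__main__":
    print(format_training_line(1, "ACGT", delimiter=","))  # prints 1,0,2,3,1
```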

Predicting modifications in a sequence

The main program, smep_prediction_py2.7.pl (or smep_prediction_py3.5.pl in a Python 3.5 environment), predicts the modifications in a sequence. Three parameters (-I, -T, -O) must be provided.

perl smep_prediction_py2.7.pl -I input_fasta_sequence -T modification_type -O output_file

-I, the input sequence in FASTA format
-T, the epigenetic modification type; one of the six pre-constructed models (5mC, 6mA, m6A, H3K27me3, H3K4me3 or H3K9ac)
-O, the output file

The following are some example commands.
perl smep_prediction_py2.7.pl -I test_5mc.fasta -O test_5mC.out -T 5mC
perl smep_prediction_py2.7.pl -I test_6mA.fasta -O test_6mA.out -T 6mA
perl smep_prediction_py2.7.pl -I test_m6A.fasta -O test_m6A.out -T m6A
perl smep_prediction_py2.7.pl -I test_H3K27me3.fasta -O test_H3K27me3.out -T H3K27me3
perl smep_prediction_py2.7.pl -I test_H3K4me3.fasta -O test_H3K4me3.out -T H3K4me3
perl smep_prediction_py2.7.pl -I test_H3K9ac.fasta -O test_H3K9ac.out -T H3K9ac
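
When predictions are launched from a Python script, checking the modification type before calling Perl catches typos early. The sketch below is an assumption-laden illustration: `MODIFICATION_TYPES` and `build_prediction_command` are hypothetical names, and the default script name is the Python 2.7 variant of the prediction program.

```python
# The six pre-constructed model types accepted by -T.
MODIFICATION_TYPES = ("5mC", "6mA", "m6A", "H3K27me3", "H3K4me3", "H3K9ac")


def build_prediction_command(fasta_path, modification, output_path,
                             script="smep_prediction_py2.7.pl"):
    """Return the argument list for one prediction run."""
    if modification not in MODIFICATION_TYPES:
        raise ValueError(
            "modification must be one of {}, got {!r}".format(
                ", ".join(MODIFICATION_TYPES), modification))
    return ["perl", script,
            "-I", fasta_path, "-T", modification, "-O", output_path]


if __name__ == "__main__":
    print(" ".join(build_prediction_command("test_6mA.fasta", "6mA", "test_6mA.out")))
```

Passing the list to `subprocess.call` runs the prediction; the type check is case-sensitive, so `5mc` is rejected.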

The predicted results are saved in the output file. The first column is the fragment number. The second and third columns are the sequence ID and the position of the first nucleotide of the fragment. The fourth and fifth columns are the predicted flag for the modification and its probability. The sixth column is the sequence of the fragment. The flags and their corresponding modifications are as follows.

5mC: 0 (no modification), 1 (CG), 2 (CHG), 3 (CHH).
For the other modifications (6mA, m6A, H3K4me3, H3K27me3 and H3K9ac), 0 and 1 represent unmodified and modified, respectively.
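
The six output columns and the flag meanings above can be read back with a short parser such as the one below. This is a sketch under stated assumptions: the columns are assumed to be separated by whitespace, the fragment number, position and flag are assumed to be integers, and `Prediction`, `parse_prediction_line` and `flag_label` are hypothetical names.

```python
from collections import namedtuple

# One record per output line, in the column order described above.
Prediction = namedtuple(
    "Prediction",
    ["fragment", "seq_id", "start", "flag", "probability", "sequence"])

FLAG_LABELS_5MC = {0: "unmodified", 1: "CG", 2: "CHG", 3: "CHH"}
FLAG_LABELS_BINARY = {0: "unmodified", 1: "modified"}


def parse_prediction_line(line):
    """Split one output line into a Prediction (whitespace-separated, assumed)."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError("expected 6 columns, got {}".format(len(fields)))
    return Prediction(int(fields[0]), fields[1], int(fields[2]),
                      int(fields[3]), float(fields[4]), fields[5])


def flag_label(flag, modification):
    """Map a numeric flag to its meaning for the given modification type."""
    labels = FLAG_LABELS_5MC if modification == "5mC" else FLAG_LABELS_BINARY
    return labels[flag]


if __name__ == "__main__":
    # Made-up line for illustration only; it is not real SMEP output.
    record = parse_prediction_line("1\tChr1\t101\t2\t0.93\tACGT")
    print(record.seq_id, flag_label(record.flag, "5mC"))
```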

