RPSLearner: A novel approach combining Random Projection and Stacking Learning for categorizing NSCLC

In this study, to address the challenges of subtype prediction in non-small cell lung cancer (NSCLC), we developed RPSLearner, which combines random projection (RP) and stacking learning for accurate classification. RP reduces the dimensionality of the data while preserving sample-to-sample distances, and stacking learning integrates fused features and predictions from diverse models. RPSLearner achieves higher accuracy, F1, and AUC than conventional machine learning models and state-of-the-art methods. Its feature fusion strategy performed better than score ensemble approaches in subtype prediction. The results are also interpretable: the expression of the differentially expressed genes (DEGs) aligns well with the published literature and offers insights into potential novel biomarkers. The framework could potentially be extended to subtype identification in other cancers.
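
The two ideas above can be illustrated with a minimal sketch built on scikit-learn. This is not the RPSLearner implementation: the synthetic data, the number of projections, the projected dimension, the choice of base models, and the helper name `fuse` are all assumptions made for illustration. It checks how well one Gaussian random projection preserves pairwise distances, concatenates several projections into fused features, and trains a stacking classifier whose meta-learner sees both the fused features and the base-model predictions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, pairwise_distances
from sklearn.model_selection import train_test_split
from sklearn.random_projection import GaussianRandomProjection
from sklearn.svm import SVC

# Synthetic stand-in for an expression matrix (assumption): 200 samples x 2000 features.
X, y = make_classification(
    n_samples=200, n_features=2000, n_informative=50, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Three independent Gaussian random projections down to 100 dimensions each.
projections = [
    GaussianRandomProjection(n_components=100, random_state=seed).fit(X_train)
    for seed in range(3)
]

# Distance preservation: ratio of projected to original pairwise distances.
iu = np.triu_indices(len(X_train), k=1)
d_orig = pairwise_distances(X_train)[iu]
d_proj = pairwise_distances(projections[0].transform(X_train))[iu]
mean_ratio = float(np.mean(d_proj / d_orig))
print(f"mean distance ratio after projection: {mean_ratio:.3f}")


def fuse(features):
    """Concatenate the outputs of all projections (hypothetical helper)."""
    return np.hstack([p.transform(features) for p in projections])


# Stacking: base models feed a logistic-regression meta-learner.
# passthrough=True also gives the meta-learner the fused features themselves.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("svc", SVC()),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    passthrough=True,
    cv=5,
)
stack.fit(fuse(X_train), y_train)
accuracy = accuracy_score(y_test, stack.predict(fuse(X_test)))
print(f"held-out accuracy: {accuracy:.3f}")
```

A distance ratio close to 1 indicates that the projection kept the sample geometry largely intact, which is the property that makes random projection attractive for high-dimensional expression data.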

Flowchart of RPSLearner

Installation

Clone the RPSLearner git repository

git clone https://github.com/wan-mlab/RPSLearner.git

Navigate to the RPSLearner package directory and install it

cd /path/to/RPSLearner
pip install .

Tutorials

How to use the method for RNA-seq data

# Usage Example for RPSLearner

import pandas as pd
from RPSLearner import RPSLearner
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, accuracy_score

data = pd.read_csv('data/rnaseq_tcga.csv')

tpm = data.drop('Subtype', axis=1)
subtype = data['Subtype'] # Use '0' for LUAD, and '1' for LUSC

metrics, y_probs, y_labels = RPSLearner(
    tpm.values, subtype, n_jobs=5)
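
The example above expects a table with one row per sample, one column per gene, and a `Subtype` column. A minimal sketch of preparing such a table follows, assuming the labels start out as the strings `LUAD` and `LUSC` and should become the integer codes 0 and 1. The gene names and values are synthetic placeholders, and the snippet does not call RPSLearner itself.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a TPM table (assumption): 6 samples x 5 genes.
rng = np.random.default_rng(0)
genes = [f"GENE_{i}" for i in range(5)]  # hypothetical gene names
raw = pd.DataFrame(rng.gamma(2.0, 10.0, size=(6, 5)), columns=genes)
raw["Subtype"] = ["LUAD", "LUSC", "LUAD", "LUSC", "LUSC", "LUAD"]

# Encode the subtype labels as 0 (LUAD) and 1 (LUSC).
label_map = {"LUAD": 0, "LUSC": 1}
data = raw.assign(Subtype=raw["Subtype"].map(label_map))

# Fail early if a label outside the mapping slipped in.
if data["Subtype"].isna().any():
    raise ValueError("Unexpected subtype label; expected LUAD or LUSC")

# Split into the feature matrix and label vector used in the tutorial.
tpm = data.drop("Subtype", axis=1)
subtype = data["Subtype"].astype(int)
print(tpm.shape, subtype.tolist())
```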

Reproducing the analyses

cor_plot.ipynb generates the correlation comparison among dimensionality reduction algorithms.
RanBALL_test.py generates the comparison against score averaging.
base_vs_stack.py generates the comparison between stacking and the individual base models.
pipeline.py generates the benchmarking results against state-of-the-art methods.
DE_analysis.ipynb generates the differential expression comparison and GO pathway analysis.
drug_disease_gene.ipynb reproduces the gene-drug-disease interaction analysis for drug repurposing.
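
For readers unfamiliar with the score-average baseline mentioned in this list, the sketch below shows one common way to build it: fit several base models, average their predicted class-1 probabilities, and score the average with AUC. It is an illustration under assumptions (synthetic data, arbitrary base models) and is not taken from the repository scripts.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic binary classification data (assumption).
X, y = make_classification(
    n_samples=300, n_features=50, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Arbitrary base models chosen for illustration.
base_models = {
    "logreg": LogisticRegression(max_iter=1000),
    "rf": RandomForestClassifier(n_estimators=100, random_state=0),
    "knn": KNeighborsClassifier(n_neighbors=5),
}

# Collect each model's predicted probability for class 1 on the test set.
scores = {}
for name, model in base_models.items():
    model.fit(X_train, y_train)
    scores[name] = model.predict_proba(X_test)[:, 1]

# Score ensemble: unweighted mean of the per-model probabilities.
avg_score = np.mean(list(scores.values()), axis=0)

for name, s in scores.items():
    print(f"{name}: AUC = {roc_auc_score(y_test, s):.3f}")
print(f"score average: AUC = {roc_auc_score(y_test, avg_score):.3f}")
```

Unlike stacking, this baseline learns no weights: every base model contributes equally regardless of how reliable it is.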

Bug Report

If you find any bugs or problems, or have any comments on RPSLearner, please don't hesitate to contact us by email at [email protected]

Authors

Xinchao Wu, Shibiao Wan

Publication

License

GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.
