Comment on the section: 3.2.4. Data distribution check #3

Open
massonix opened this issue Jul 2, 2024 · 0 comments

Hi,

Thanks for developing this wonderful analysis workflow.

I'm running the pipeline outlined in the notebook LME_Classification.ipynb. This section has the following two plots:

image

These plots show the distribution of mean expression across all genes. The notebook interprets them as follows:

In the plot on the right, the expression values start off very low and then rise before dropping down. This pattern suggests potential RNA degradation, which can compromise the reliability and accuracy of downstream analyses. In contrast, the distribution plot on the left shows good-quality gene expression data. Deviations from such distributions may indicate gene degradation, should be carefully investigated and, if necessary, corrected to ensure high-quality data.

This is what my distribution looks like:

image

However, I don't understand why this should be problematic. A common pre-processing step in any RNA-seq analysis is to exclude lowly expressed genes, which do not contain enough information for robust statistical analysis. This is the plot in my R Markdown notebook where I choose the expression threshold for excluding genes:

image

which looks like the plot on the left. Thus, after filtering, all I'm left with is highly expressed, reliable genes. What does that have to do with RNA degradation?
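
To make the filtering step concrete, here is a minimal sketch in Python of the kind of low-expression filter I mean. It is only an illustration: the function name `filter_low_expression`, the toy matrix, and the threshold of 1.0 are assumptions made up for this example, and they are not taken from the notebook or from my R code.

```python
import numpy as np


def filter_low_expression(log_expr, min_mean=1.0):
    """Keep genes (rows) whose mean log-expression across samples is >= min_mean.

    log_expr: 2D array-like, genes x samples, assumed to be log-scale values.
    min_mean: hypothetical cutoff; in practice it is chosen by inspecting
              the distribution of per-gene means.
    """
    log_expr = np.asarray(log_expr, dtype=float)
    gene_means = log_expr.mean(axis=1)  # one mean per gene
    keep = gene_means >= min_mean       # boolean mask over genes
    return log_expr[keep], keep


# Toy data (made up): 4 genes x 3 samples
toy = np.array([
    [0.0, 0.1, 0.2],  # lowly expressed
    [5.0, 6.0, 7.0],  # well expressed
    [0.5, 0.4, 0.6],  # lowly expressed
    [8.0, 9.0, 7.5],  # well expressed
])

filtered, keep = filter_low_expression(toy, min_mean=1.0)
print(keep)            # which genes pass the filter
print(filtered.shape)  # genes remaining x samples
```

After a filter like this, the low-expression part of the distribution of per-gene means is removed by construction, which is why I don't see how the shape of the unfiltered distribution relates to degradation.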

If you could explain it, that would be super useful.

Thanks!

Ramon
