Hi,
Thanks for developing this wonderful analysis workflow.
I'm running the pipeline outlined in the notebook LME_Classification.ipynb. This section has the following two plots:
These plots show the distribution of the mean expression across all genes. The interpretation given is the following:
In the plot on the right, the expression values start off very low and then rise before dropping down. This pattern suggests potential RNA degradation, which can compromise the reliability and accuracy of downstream analyses. In contrast, the distribution plot on the left shows good-quality gene expression data. Deviations from such distributions may indicate gene degradation, should be carefully investigated and, if necessary, corrected to ensure high-quality data.
This is what my distribution looks like:
However, I don't understand why this should be a problem. A common pre-processing step in any RNA-seq analysis is to exclude lowly expressed genes, which do not carry enough information for robust statistical analysis. This is the plot in my R Markdown notebook where I choose the expression threshold for excluding genes:
which looks like the plot on the left. Thus, after filtering all I'm left with is highly expressed, reliable genes. What does that have to do with RNA degradation?
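To make the filtering step concrete, here is a minimal Python sketch of what is described above. It is only an illustration under stated assumptions: the data are synthetic, and the function name `filter_low_expression`, the log2(x + 1) transform, and the threshold of 1.0 are hypothetical choices, not taken from LME_Classification.ipynb or from the R Markdown notebook.

```python
import numpy as np


def filter_low_expression(expr, min_mean_log_expr=1.0):
    """Drop genes whose mean log2(x + 1) expression is below a threshold.

    expr: genes x samples array of non-negative expression values.
    min_mean_log_expr: hypothetical cutoff, chosen here only for illustration.
    """
    log_expr = np.log2(expr + 1.0)
    gene_means = log_expr.mean(axis=1)  # one mean per gene
    keep = gene_means >= min_mean_log_expr
    return expr[keep], gene_means, keep


# Synthetic data (an assumption): a lowly expressed group and a well expressed group.
rng = np.random.default_rng(0)
low = rng.gamma(shape=1.0, scale=0.2, size=(300, 6))
high = rng.lognormal(mean=4.0, sigma=1.0, size=(700, 6))
expr = np.vstack([low, high])

filtered, gene_means, keep = filter_low_expression(expr, min_mean_log_expr=1.0)

print("genes before filtering:", expr.shape[0])
print("genes after filtering: ", filtered.shape[0])
print("lowest mean log2 expression kept:", gene_means[keep].min())
```

Plotting a histogram of `gene_means` before filtering and of `gene_means[keep]` after it shows the effect in question: the low-expression mode is present in the first and absent from the second.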
If you could explain it, that would be super useful.
Thanks!
Ramon