From 3f3243236ffdf18514b1191a9d03a5eb283658c1 Mon Sep 17 00:00:00 2001
From: Matthew Menold
Date: Tue, 28 Nov 2023 15:58:06 -0500
Subject: [PATCH 1/4] Added Error header in rnaseq-rmd.rst and docs for 'Found duplicate names' error in rnaseq.Rmd closes #393

---
 docs/rnaseq-rmd.rst | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/docs/rnaseq-rmd.rst b/docs/rnaseq-rmd.rst
index 430ea383..666cdcc2 100644
--- a/docs/rnaseq-rmd.rst
+++ b/docs/rnaseq-rmd.rst
@@ -544,6 +544,41 @@ enrichment RMarkdown document.

 The output of sessionInfo records the versions of packages used in the analysis.

+Errors
+------
+
+In this section, we address some common errors encountered during the RNA-Seq downstream analysis
+and provide guidance on how to resolve them.
+
+Error: "Found duplicate names after removing pattern ^contr_[^_]+_"
+NOTE: Error in:: purrr::map() can also be caused by this:
+  Cause:
+    - If there are no duplicate contrast names in the:: results_## chunks:
+      a more subtle cause for this error occurs when the pattern is altered
+      on a previously cached chunk. Even if the change is undone, the
+      environment is cleared, and the file is rerun, the error will persist.
+      An example of this would be if you completed the:: results_01 chunk
+      followed by the:: results_02 chunk (with both chunks cached), rendered
+      the file and then went back and altered the:: contr_01 portion of::
+      results_01 to e.g., contr_05, then cleared the environment and rendered
+      the file. At this point, the R environment contains both the
+      original:: contr_01 pattern followed by the contrast name as well as
+      the new contr_05 pattern followed by the contrast name. Since the
+      remaining "contrast name" portion of the string is shared between
+      contr_01 and contr_05, this would cause the error as the code is
+      unable to differentiate between the old and new contr_xx patterns. You
+      may be wondering, "how is the old pattern persisting in the
+      R environment after it has been changed and the file has been rerun
+      (even after quitting R/clearing the environment and workspace)." That
+      is because the:: results_02 chunk cache contains the former::
+      results_01 chunk's objects even after they have been changed.
+
+  Solution:
+    - All cache and R environment objects must be cleared upstream of the::
+      assemble_variables chunk. To do so: Quit R (without saving the
+      workspace), delete .RData file (present if the workspace was ever
+      saved), remove all cache files, open rnaseq.Rmd and run again.
+
 Glossary
 --------
 .. glossary::

From 326bb8464eb758bae5183c69efc350679ccfd565 Mon Sep 17 00:00:00 2001
From: Matthew Menold
Date: Tue, 28 Nov 2023 16:13:31 -0500
Subject: [PATCH 2/4] Added hyperlink to rnaseq.Rmd 'found duplicate names' error message that leads to docs. Closes issue #395

---
 lib/lcdbwf/R/helpers.R | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/lcdbwf/R/helpers.R b/lib/lcdbwf/R/helpers.R
index c88779b1..405f3953 100644
--- a/lib/lcdbwf/R/helpers.R
+++ b/lib/lcdbwf/R/helpers.R
@@ -81,7 +81,7 @@ collect_objects <- function(pattern, fixed=FALSE){
   # If there was a wildcard in the pattern there is a risk that the modified
   # names are no longer unique
   if (length(unique(modified_names)) != length(var_names)){
-    stop(paste("Found duplicate names after removing pattern", pattern))
+    stop(paste("Found duplicate names after removing pattern", pattern, "see https://lcdb.github.io/lcdb-wf/rnaseq-rmd.html#Errors for details"))
   }
   names(obj_list) <- modified_names
   return(obj_list)

From b335e337d0f9e93990b8e1911d3e35599c240da1 Mon Sep 17 00:00:00 2001
From: Matthew Menold
Date: Mon, 3 Jun 2024 15:38:23 -0400
Subject: [PATCH 3/4] Reduced verbosity rnaseq-rmd.rst

---
 docs/rnaseq-rmd.rst | 40 ++++++++++++++--------------------------
 1 file changed, 14 insertions(+), 26 deletions(-)

diff --git a/docs/rnaseq-rmd.rst b/docs/rnaseq-rmd.rst
index 666cdcc2..679a5c53 100644
--- a/docs/rnaseq-rmd.rst
+++ b/docs/rnaseq-rmd.rst
@@ -547,37 +547,25 @@ The output of sessionInfo records the versions of packages used in the analysis.
 Errors
 ------

-In this section, we address some common errors encountered during the RNA-Seq downstream analysis
-and provide guidance on how to resolve them.
+This section addresses errors encountered during the RNA-Seq downstream analysis.

 Error: "Found duplicate names after removing pattern ^contr_[^_]+_"
-NOTE: Error in:: purrr::map() can also be caused by this:
+
   Cause:
-    - If there are no duplicate contrast names in the:: results_## chunks:
-      a more subtle cause for this error occurs when the pattern is altered
-      on a previously cached chunk. Even if the change is undone, the
-      environment is cleared, and the file is rerun, the error will persist.
-      An example of this would be if you completed the:: results_01 chunk
-      followed by the:: results_02 chunk (with both chunks cached), rendered
-      the file and then went back and altered the:: contr_01 portion of::
-      results_01 to e.g., contr_05, then cleared the environment and rendered
-      the file. At this point, the R environment contains both the
-      original:: contr_01 pattern followed by the contrast name as well as
-      the new contr_05 pattern followed by the contrast name. Since the
-      remaining "contrast name" portion of the string is shared between
-      contr_01 and contr_05, this would cause the error as the code is
-      unable to differentiate between the old and new contr_xx patterns. You
-      may be wondering, "how is the old pattern persisting in the
-      R environment after it has been changed and the file has been rerun
-      (even after quitting R/clearing the environment and workspace)." That
-      is because the:: results_02 chunk cache contains the former::
-      results_01 chunk's objects even after they have been changed.
+    - If no duplicate contrast names exist in your ``results_##`` chunks,
+      the error can arise from changes to previously cached chunks.
+      Even if the change is undone and the environment is cleared,
+      the error may persist. For example, changing ``contr_01``
+      in ``results_01`` to ``contr_05`` and then reknitting the file
+      will cause the R environment to contain both ``contr_01`` and
+      ``contr_05`` patterns after that chunk is loaded, leading to
+      this error.

   Solution:
-    - All cache and R environment objects must be cleared upstream of the::
-      assemble_variables chunk. To do so: Quit R (without saving the
-      workspace), delete .RData file (present if the workspace was ever
-      saved), remove all cache files, open rnaseq.Rmd and run again.
+    - Clear all cache and R environment objects before the
+      ``assemble_variables`` chunk. Quit R without saving the
+      workspace, delete the .RData file, remove all ``rnaseq_cache`` files,
+      open rnaseq.Rmd and knit again.

 Glossary
 --------

From bee39e04010352c2dca63757f982e22fe213e0e4 Mon Sep 17 00:00:00 2001
From: Matthew Menold
Date: Mon, 3 Jun 2024 15:52:31 -0400
Subject: [PATCH 4/4] minor change to error docs

---
 docs/rnaseq-rmd.rst | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/docs/rnaseq-rmd.rst b/docs/rnaseq-rmd.rst
index 679a5c53..5968ffe3 100644
--- a/docs/rnaseq-rmd.rst
+++ b/docs/rnaseq-rmd.rst
@@ -557,15 +557,14 @@ Error: "Found duplicate names after removing pattern ^contr_[^_]+_"
       Even if the change is undone and the environment is cleared,
       the error may persist. For example, changing ``contr_01``
       in ``results_01`` to ``contr_05`` and then reknitting the file
-      will cause the R environment to contain both ``contr_01`` and
+      will cause the R environment to contain both ``contr_01`` and
       ``contr_05`` patterns after that chunk is loaded, leading to
       this error.

   Solution:
-    - Clear all cache and R environment objects before the
-      ``assemble_variables`` chunk. Quit R without saving the
-      workspace, delete the .RData file, remove all ``rnaseq_cache`` files,
-      open rnaseq.Rmd and knit again.
+    - Clear all cache and R environment objects.
+      Quit R without saving the workspace, delete the .RData file,
+      remove the ``rnaseq_cache`` directory, open rnaseq.Rmd and re-knit.

 Glossary
 --------
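
Note on the failure mode documented in this series: the following is a minimal R sketch of how two leftover objects can collide once the ``contr_XX_`` prefix is removed. The object names are hypothetical, and stripping the prefix with base R `sub()` is an assumption; the patch shows only the uniqueness check inside `collect_objects`, not how `modified_names` is built.

```r
# Hypothetical object names: one restored from a stale chunk cache
# (contr_01_...) and one created after the chunk was edited (contr_05_...).
var_names <- c("contr_01_ko_vs_wt", "contr_05_ko_vs_wt")

# Pattern quoted in the error message.
pattern <- "^contr_[^_]+_"

# Assumption: the prefix is stripped with base R sub(); the real helper
# may build modified_names differently.
modified_names <- sub(pattern, "", var_names)
print(modified_names)

# Same uniqueness comparison as the condition shown in the helpers.R hunk.
has_duplicates <- length(unique(modified_names)) != length(var_names)
print(has_duplicates)
```

Both names reduce to the same contrast label, so the comparison flags a duplicate even though no two chunks in the Rmd share a contrast name. This is consistent with the documented fix of removing the cache so that only one of the two objects exists.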