Fix examples
aubreyodom committed Jan 25, 2024
1 parent 9c61eb4 commit 5e4e044
Showing 31 changed files with 226 additions and 69 deletions.
2 changes: 2 additions & 0 deletions .Rbuildignore
@@ -3,3 +3,5 @@
^LICENSE\.md$
^vignettes\docs
^Legato-docs
^pkgdown
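
As a rough check of what these patterns exclude: `R CMD build` reads each `.Rbuildignore` line as a Perl-style regular expression and matches it case-insensitively against file paths relative to the package root. The sketch below approximates that with `grepl()`. The file paths are made up for illustration, and `grepl()` is only a stand-in for the build tool's own matching.

```r
# Hypothetical file paths, relative to the package root (illustration only).
paths <- c("pkgdown/_pkgdown.yml", "Legato-docs/index.html",
           "vignettes/docs/intro.html", "R/get_long_data.R")

# Patterns as they appear in .Rbuildignore (backslashes doubled for R strings).
patterns <- c("^LICENSE\\.md$", "^vignettes\\docs", "^Legato-docs", "^pkgdown")

# Approximate the build tool: a path is ignored if any pattern matches it.
ignored <- vapply(paths, function(p) {
  any(vapply(patterns, grepl, logical(1), x = p,
             perl = TRUE, ignore.case = TRUE))
}, logical(1))

print(ignored)
```

Under this approximation the `vignettes/docs/` path is not ignored, because `\d` in `^vignettes\docs` is read as "any digit" rather than a path separator followed by `d`. If the intent is to exclude that folder, `^vignettes/docs` may be the pattern that was meant.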

8 changes: 4 additions & 4 deletions DESCRIPTION
@@ -1,15 +1,15 @@
Package: LegATo
Title: LegATo: Longitudinal mEtaGenomic Analysis Toolkit
Version: 0.0.0.9000
Version: 0.99.0
Authors@R: c(
person("Aubrey", "Odom", , "[email protected]", role = c("aut", "cre"),
comment = c(ORCID = "0000-0001-7113-7598")),
person("Yilong", "Zhang", , "[email protected]", role = "ctb"),
person("Jared", "Pincus", , "[email protected]", role = "csl"),
person("Jared", "Pincus", , "[email protected]", role = "csl",
comment = c(ORCID = "0000-0001-6708-5262")),
person("Jordan", "Pincus", , "[email protected]", role = "art")
)
Description: Streamlining longitudinal microbiome profiling in
Bioconductor.
Description: Streamlining longitudinal microbiome profiling in Bioconductor.
License: MIT + file LICENSE
Depends:
R (>= 4.3.0)
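
The version bump can be sanity-checked with base R's `package_version()`, which compares version components numerically rather than as text. This is a small sketch and not part of the commit. The variable names are illustrative only.

```r
# Versions taken from the DESCRIPTION diff above.
old_version <- package_version("0.0.0.9000")
new_version <- package_version("0.99.0")

# Components are compared left to right as integers: 0.99 > 0.0.
is_newer <- new_version > old_version
print(is_newer)

# Check the running R against the declared dependency, R (>= 4.3.0).
print(getRversion() >= "4.3.0")
```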
13 changes: 13 additions & 0 deletions NEWS.md
@@ -0,0 +1,13 @@
# LegATo 0.99.0 (Spring 2024)

* Pre-Release version of LegATo

## Bug Fixes
* None to report

## Major Changes
* Readying for the big leagues
* The Pincuses contributed the etymology and icon

## Minor Changes
* Nearly as many as the major changes
2 changes: 1 addition & 1 deletion R/filter_animalcules_MAE.R
@@ -19,7 +19,7 @@ utils::globalVariables(".")
#' @importFrom rlang .data
#'
#' @examples
#' in_dat <- system.file("extdata/MAE_small.RDS", package = "LegATo") %>% readRDS()
#' in_dat <- system.file("extdata/MAE_small.RDS", package = "LegATo") |> readRDS()
#' filter_animalcules_MAE(in_dat, 0.01)
#'

4 changes: 2 additions & 2 deletions R/get_long_data.R
@@ -2,7 +2,7 @@
#'
#' This function takes a \code{MultiAssayExperiment} object and a specified
#' taxon level of interest and creates a long \code{data.frame} that can be used
#' more easily for plotting counts.
#' more easily for plotting counts data.
#'
#' @inheritParams plot_stacked_bar
#' @param log logical. Indicate whether an assay returned should be the log of
@@ -21,7 +21,7 @@
#' @importFrom rlang .data
#'
#' @examples
#' in_dat <- system.file("extdata/MAE_small.RDS", package = "LegATo") %>% readRDS()
#' in_dat <- system.file("extdata/MAE_small.RDS", package = "LegATo") |> readRDS()
#' out <- get_long_data(in_dat, "genus", log = TRUE, counts_to_CPM = TRUE)
#' head(out)
#'
14 changes: 12 additions & 2 deletions R/get_stacked_data.R
@@ -1,9 +1,19 @@
#' Documentation
#' Create a long data.frame with grouped abundances from a MultiAssayExperiment counts object
#'
#' This function takes a \code{MultiAssayExperiment} object and a specified
#' taxon level of interest and creates a long \code{data.frame} that can be used
#' more easily for plotting counts data in a stacked bar plot or a stacked area
#' chart. The function groups taxa and computes relative abundance within taxa strata.
#'
#' @inheritParams plot_spaghetti
#'
#' @return A \code{data.frame} consisting of the counts data, taxa, and metadata.
#'
#' @export
#' @importFrom rlang .data
#'
#' @examples
#' in_dat <- system.file("extdata/MAE_small.RDS", package = "LegATo") %>% readRDS()
#' in_dat <- system.file("extdata/MAE_small.RDS", package = "LegATo") |> readRDS()
#' get_stacked_data(in_dat, "genus", covariate_1 = "Sex", covariate_time = "Month")
#'

2 changes: 1 addition & 1 deletion R/get_summary_table.R
@@ -15,7 +15,7 @@
#' @importFrom rlang .data
#'
#' @examples
#' in_dat <- system.file("extdata/MAE_small.RDS", package = "LegATo") %>% readRDS()
#' in_dat <- system.file("extdata/MAE_small.RDS", package = "LegATo") |> readRDS()
#' out <- get_summary_table(in_dat, c("Group", "Subject"))
#' head(out)
#'
2 changes: 1 addition & 1 deletion R/get_top_taxa.R
@@ -12,7 +12,7 @@
#' @importFrom rlang .data
#'
#' @examples
#' in_dat <- system.file("extdata/MAE_small.RDS", package = "LegATo") %>% readRDS()
#' in_dat <- system.file("extdata/MAE_small.RDS", package = "LegATo") |> readRDS()
#' out <- get_top_taxa(in_dat, "genus")
#' out
#'
2 changes: 1 addition & 1 deletion R/parse_MAE_SE.R
@@ -21,7 +21,7 @@
#' @import MultiAssayExperiment
#'
#' @examples
#' in_dat <- system.file("extdata/MAE_small.RDS", package = "LegATo") %>% readRDS()
#' in_dat <- system.file("extdata/MAE_small.RDS", package = "LegATo") |> readRDS()
#' out <- parse_MAE_SE(in_dat)
#' head(out$tax)
#' head(out$sam)
2 changes: 1 addition & 1 deletion R/plot_alluvial.R
@@ -16,7 +16,7 @@
#' @importFrom rlang .data
#'
#' @examples
#' in_dat <- system.file("extdata/MAE_small.RDS", package = "LegATo") %>% readRDS()
#' in_dat <- system.file("extdata/MAE_small.RDS", package = "LegATo") |> readRDS()
#' plot_alluvial(in_dat, taxon_level = "family", covariate_1 = "Group", covariate_time = "Month",
#' palette_input = rainbow(25))
#'
2 changes: 1 addition & 1 deletion R/plot_spaghetti.R
@@ -22,7 +22,7 @@
#' @importFrom rlang .data
#'
#' @examples
#' in_dat <- system.file("extdata/MAE_small.RDS", package = "LegATo") %>% readRDS()
#' in_dat <- system.file("extdata/MAE_small.RDS", package = "LegATo") |> readRDS()
#' all_taxa <- get_top_taxa(in_dat, "phylum")
#' plot_spaghetti(in_dat, taxon_level = "phylum", covariate_1 = "Group", covariate_time = "Month",
#' unit_var = "Subject", which_taxon = all_taxa$taxon[1],
2 changes: 1 addition & 1 deletion R/plot_stacked_area.R
@@ -16,7 +16,7 @@
#' @importFrom rlang .data
#'
#' @examples
#' in_dat <- system.file("extdata/MAE_small.RDS", package = "LegATo") %>% readRDS()
#' in_dat <- system.file("extdata/MAE_small.RDS", package = "LegATo") |> readRDS()
#' plot_stacked_area(in_dat, taxon_level = "phylum", covariate_1 = "Group",
#' covariate_time = "Month",
#' palette_input = rainbow(25))
2 changes: 1 addition & 1 deletion R/plot_stacked_bar.R
@@ -30,7 +30,7 @@
#' @importFrom rlang .data
#'
#' @examples
#' in_dat <- system.file("extdata/MAE_small.RDS", package = "LegATo") %>% readRDS()
#' in_dat <- system.file("extdata/MAE_small.RDS", package = "LegATo") |> readRDS()
#' plot_stacked_bar(in_dat, taxon_level = "family", covariate_1 = "Group",
#' covariate_time = "Month",
#' palette_input = rainbow(25))
50 changes: 36 additions & 14 deletions R/run_gee_model.R
@@ -42,23 +42,44 @@ test_models_gee <- function(tn, input_df, unit_var, fixed_cov,
return(res_out)
}

#' Compute Generalized Estimating Equations (GEEs)
#'
#' Run an independent GEE model for each taxa with relative abundance
#' Works well with small data - multiple subpoints/subjects across clusters
#'
#' Source
#' https://data.library.virginia.edu/getting-started-with-generalized-estimating-equations/
#'
#' fixed_cov is a vector
#'
#' Compute Generalized Estimating Equations (GEEs) on longitudinal microbiome
#' data
#'
#' This function takes an animalcules-formatted \code{MultiAssayExperiment} and
#' runs an independent GEE model for each taxon. The model predicts taxon log
#' CPM abundance as a product of fixed-effects covariates conditional on a
#' grouping ID variable, usually the unit on which repeated measurements were
#' taken. This modeling approach works best with small datasets that have
#' multiple samples across many (>40) clusters/units.
#'
#' P-values are adjusted across taxa, separately for each model coefficient. The
#' following methods are permitted: \code{c("holm", "hochberg", "hommel",
#' "bonferroni", "BH", "BY", "fdr", "none")}
#'
#' @inheritParams test_hotelling_t2
#' @param fixed_cov A character vector naming covariates to be tested.
#' @param corstr A character string specifying the correlation structure. The
#' following are permitted: '"independence"', '"exchangeable"', '"ar1"',
#' '"unstructured"'.
#' @param p_adj_method A character string specifying the correction method. Can
#' be abbreviated. See details. Default is \code{"fdr"}.
#' @param plot_out Logical indicating whether plots should be output alongside
#' the model results. Default is \code{FALSE}.
#' @param plotsave_loc A character string giving the folder path to save plot
#' outputs. This defaults to the current working directory.
#' @param plot_terms Character vector. Which terms should be examined in the
#' plot output? Can overlap with the \code{fixed_cov} inputs.
#' @param ... Further arguments passed to \code{ggsave} for plot creation.
#'
#' @export
#' @importFrom rlang .data
#'
#'
#' @examples
#' in_dat <- system.file("extdata/MAE_small.RDS", package = "LegATo") %>% readRDS()
#' in_dat <- system.file("extdata/MAE_small.RDS", package = "LegATo") |>
#' readRDS()
#' out <- run_gee_model(in_dat, taxon_level = "genus", unit_var = "Subject",
#' fixed_cov = c("HairLength", "Age", "Group", "Sex"), corstr = "ar1")
#' fixed_cov = c("HairLength", "Age", "Group", "Sex"),
#' corstr = "ar1")
#' head(out)
#'

@@ -67,6 +88,7 @@ run_gee_model <- function(dat,
unit_var,
fixed_cov,
corstr = "ar1",
p_adj_method = "fdr",
plot_out = FALSE,
plotsave_loc = ".",
plot_terms = NULL,
@@ -84,7 +106,7 @@ run_gee_model <- function(dat,
data.table::rbindlist() %>%
dplyr::arrange(.data$Coefficient) %>%
dplyr::group_by(.data$Coefficient) %>%
dplyr::mutate("Adj p-value" = stats::p.adjust(.data$`Pr(>|W|)`, method = "bonferroni")) %>%
dplyr::mutate("Adj p-value" = stats::p.adjust(.data$`Pr(>|W|)`, method = p_adj_method)) %>%
dplyr::rename("Unadj p-value" = .data$`Pr(>|W|)`) %>%
as.data.frame()
return(storage)
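
The hunk above replaces the hard-coded Bonferroni correction with the user-supplied `p_adj_method`. Below is a base-R sketch of the same idea, adjusting p-values separately within each coefficient group. The toy table, its values, its column names, and the helper `adjust_by_coefficient()` are invented for illustration. `ave()` stands in for the `dplyr::group_by()` and `dplyr::mutate()` steps, so this is not the package's implementation.

```r
# Toy results table: one row per taxon-coefficient pair (made-up p-values).
res <- data.frame(
  Taxon = rep(c("TaxonA", "TaxonB", "TaxonC"), times = 2),
  Coefficient = rep(c("Age", "Sex"), each = 3),
  p_value = c(0.01, 0.04, 0.20, 0.03, 0.50, 0.002)
)

# Hypothetical helper: adjust p-values within each coefficient group.
adjust_by_coefficient <- function(df, p_adj_method = "fdr") {
  # Validate the method against those that stats::p.adjust() accepts.
  p_adj_method <- match.arg(p_adj_method, stats::p.adjust.methods)
  df$adj_p_value <- ave(
    df$p_value, df$Coefficient,
    FUN = function(p) stats::p.adjust(p, method = p_adj_method)
  )
  df
}

out_fdr <- adjust_by_coefficient(res, "fdr")
out_bonf <- adjust_by_coefficient(res, "bonferroni")
print(out_fdr)
print(out_bonf)
```

Because each coefficient group is corrected on its own, the number of tests entering each correction equals the number of taxa, not the number of taxa times the number of coefficients.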
6 changes: 3 additions & 3 deletions R/test_hotelling_t2.R
@@ -20,9 +20,9 @@
n <- input_data %>% dplyr::group_by(Populations) %>%
dplyr::distinct(Subjects) %>% dplyr::summarize("n_col" = dplyr::n())
n1 <- n %>% dplyr::filter(Populations == Group1) %>%
dplyr::select(`n_col`) %>% as.numeric()
dplyr::select("n_col") %>% as.numeric()
n2 <- n %>% dplyr::filter(Populations == Group2) %>%
dplyr::select(`n_col`) %>% as.numeric()
dplyr::select("n_col") %>% as.numeric()
p <- length(unique(input_data$Taxon))
# Sample mean vector
X_i <- input_data %>%
@@ -219,7 +219,7 @@
#' @importFrom rlang .data
#'
#' @examples
#' dat <- system.file("extdata", "MAE.RDS", package = "LegATo") %>%
#' dat <- system.file("extdata", "MAE.RDS", package = "LegATo") |>
#' readRDS()
#' dat_0.05 <- filter_animalcules_MAE(dat, 0.05)
#' out1 <- test_hotelling_t2(dat = dat_0.05,
16 changes: 11 additions & 5 deletions README.md
@@ -1,6 +1,6 @@
# LegATo: Longitudinal mEtaGenomic Analysis Toolkit <img src="https://github.com/aubreyodom/Legato-docs/blob/main/legato-logo.jpg?raw=true" align="right" width="170" />

## What is LegATo?
# LegATo
### A Longitudinal mEtaGenomic Analysis Toolkit <img src="https://github.com/aubreyodom/Legato-docs/blob/main/legato-logo.jpg?raw=true" align="right" width="140">

LegATo is a suite of open-source software tools for longitudinal microbiome analysis. It is extendable to
several different study forms with optimal ease-of-use for researchers. Microbiome time-series data
@@ -9,13 +9,19 @@ designs. This toolkit will allow researchers to determine which microbial taxa a
perturbations such as onset of disease or lifestyle choices, and to predict the effects of these perturbations
over time, including changes in composition or stability of commensal bacteria.

LegATo integrates visualization, modeling and testing procedures. It is currently in development, but it will soon be supplemented by hierarchical clustering tools and multivariate generalized estimating equations (JGEEs) to adjust for the compositional nature of microbiome data. Other tools will be implemented as needed.
LegATo integrates visualization, modeling and testing procedures. It is currently in development, but it will soon be supplemented by hierarchical clustering tools and multivariate generalized estimating equations (JGEEs) to adjust for the compositional nature of microbiome data.

# Documentation
### The Story Behind the Name
In music, legato indicates that notes are played or sung smoothly and connected, without a noticeable break between them. The LegATo package facilitates a cohesive and interconnected understanding of the microbial communities represented by the samples, much like the smooth connection of musical notes in a legato passage.

Therefore, LegATo metaphorically represents the smooth and connected analysis of longitudinal metagenomic data, drawing inspiration from the musical term to convey a sense of continuity and harmony in the modeling process.

## Documentation
Documentation and tutorials for LegATo are available at our [website](https://aubreyodom.github.io/LegATo-docs/).

Check out a thorough tutorial on proper usage of our package [here](https://aubreyodom.github.io/LegATo-docs/articles/LegATo_vignette.html).

# Installation
## Installation
LegATo requires R Version 4.3.

Install the development version of the package from GitHub:

2 changes: 1 addition & 1 deletion man/LegATo-package.Rd
2 changes: 1 addition & 1 deletion man/filter_animalcules_MAE.Rd
4 changes: 2 additions & 2 deletions man/get_long_data.Rd
26 changes: 23 additions & 3 deletions man/get_stacked_data.Rd
2 changes: 1 addition & 1 deletion man/get_summary_table.Rd
2 changes: 1 addition & 1 deletion man/get_top_taxa.Rd
2 changes: 1 addition & 1 deletion man/parse_MAE_SE.Rd
2 changes: 1 addition & 1 deletion man/plot_alluvial.Rd
2 changes: 1 addition & 1 deletion man/plot_spaghetti.Rd

Some generated files are not rendered by default.