03-fixed_effects.Rmd

# Fixed effects meta-analysis

## A brief introduction to fixed effects models in meta-analysis  
The first type of model that many meta-analysts learn about and use is called a fixed effects model. Fixed effects models have strict assumptions about the population that the samples, and thus effect sizes, in a dataset come from.  Fixed effects models assume that the samples come from the *the same* larger population. So when we think about meta-analyses that aggregate data from multiple very differently designed studies, it's easy to see how ecological meta-analyses may quickly violate the assumptions of a fixed effects model. When the studies in a meta-analysis have different experimental designs, occur in a variety of locations around the globe or have different focal species or taxon, these are all good indications that fixed effects model assumptions will not be met.  

We note that, fixed effects models are often used in medical meta-analyses due to the highly controlled experimental designs required in medical research. You might be able to envision a medical meta-analyst finding enough studies that test a new drug using very similarly designed Randomized Control Trials, and so the assumptions of a fixed effects model may not be violates. 

Despite the limitations of fixed effects models, there are some researchers that identify enough studies from the same population for fixed effects modeling [@bai_meta-analysis_2013]. The fixed effects models also provide a useful starting point for our journey through ecological meta-analysis because many of the other methods build off the initial fixed effects framework statistically and as implemented programmatically.

## Fixed effects model example

Having previously run the R code in earlier chapters, you should have the necessary data called `meta_analysis_data` in your R environment.

Now, let's make sure R libraries we need for the rest of this chapter are loaded  

```{r}
library(metafor)
```

The `rma` function let's us specify the type of model we'll use to aggregate the many individual effect sizes in our database into a single summary effect size.  Remember from chapter 2, our dataframe now contains two columns with effect size data. One column has the effect size (either SMD or RR) for each row in our database comparing richness at invaded sites to richness at control sites. This is column `yi`. We also have column `vi` which contains the variance for each effect size in our database. Now, we use these two new columns in the `rma` function.  We can save the model output as an object called `fixed_effect_model_results`.

```{r}
fixed_effect_model_results <- rma(yi, # this is the effect size from each row in database
                                   vi, # measure of variance from each row in database
                                   method = "FE", # Specifies fixed effects model
                                   slab = paste(lastname, publication_year, sep = ""), # This line of code prepares study labels for the forest plot we'll create at the end of the chapter
                                   data = RR_effect_sizes) # Let R know the dataframe we'll use for our model calculations
```

Because we have not taken the steps to impute missing standard deviations you should see an error above that rows with missing data were removed from the analysis.

Now, let's take a look at the fixed effects model results.  

```{r}
fixed_effect_model_results
```

From this model output we can see that after removing NAs from the dataset, we were left with 84 rows of data containing 84 effect sizes (`yi`) and 84 associated measures of variance (`vi`). Looking at the model results section we see a summary effect size of `r round(fixed_effect_model_results$b, 4)`. Remember, the negative effect size indicates a species richness decrease at sites with invasive species. The p-value of <0.0001 suggest that the summary effect size we found in this meta-analysis are significantly different from a zero effect.  This aligns with the two values provided for lower and upper 95% confidence interval values that do not overlap zero (CI = `r round(fixed_effect_model_results$ci.lb, 4)`, `r round(fixed_effect_model_results$ci.ub, 4)`). Confidence intervals that do not overlap zero are evidence for statistically significant summary effects. Confidence intervals around the summary effect size are shown in the key primary data visualization tool used in meta-analysis, the forest plot.

## Forest plot for fixed effects model
The forest plot is an essential visualization for every meta-analysis. However, the way that scientists choose to present their forest plot varies person-to-person. In this book, we create all forest plots using the `forest` function that is installed as part of the `metafor` package. We have found it to be the most flexible way to build the wide variety of plots that we use thoughout this book. However we recommend you see what you like best for your data visualizations. For example, the package `metaviz` [@kossmier_metaviz:_2018] creates wonderful forest plots using model output from `metafor` and just a few lines of code.

Here is the code for our first forest plot using the `metafor` package

```{r, fig.height= 12}
forest(
  RR_effect_sizes$yi, # These are effect sizes from each row in database
  RR_effect_sizes$vi, # These are variances from each row in database
  annotate = FALSE, # Setting this to false prevents R from including CIs for each of the 84 effect sizes in the forest plot. Setting it to TRUE is generally a good practice, but would make this plot cluttered.
  slab = fixed_effect_model_results$slab, # Along the left hand side, we want each individual effect size labeled with author names and publication years. We specified this when we calculated the summary effect size above
  xlab = "ln(Response Ratio)", # Label for x-axis
  cex = .8, # Text side for study labels
  pch = 15, # shape of bars in forest plot
  cex.lab = 1, # Size of x-axis label
)

# This is code adds in the summary effect size for your meta-analysis
addpoly(
  fixed_effect_model_results, # specify the model where your summary effect comes from.
  col = "orange", # Pick a color that makes the summary effect stand out from the other effect sizes.
  cex = 1, # size for the text associates with summary effect
  annotate = TRUE, # Usually, we set this to true. It makes sure effect size and CI for summary effect size is printed on forest plot.
  mlab = "Summary" # Label for summary effect size
)
```

Thought the figure is a little (OK very) busy, it contains so much useful information we can pick apart.  Then, in later chapters, we can discuss how to make the complex forest plots a bit more concise.  But for now let's look at the forest plot in three sections: **left column**, **middle column**, and **right column**.

1. In the **left column**, you see the last name of an article's first author along with the year of the article's publication.  Some of these article identifiers have a `.1` or `.2` after the name and year. This is because, in the forest plot, each effect size extracted from an article is represented in the plot.  So looking at the list of authors and years, you should notice that two effect sizes (combinations of richness values for invaded and uninvaded sites) were extracted from the same article by *Frappier et al. 2013*. This is likely because the article studied the richness impacts of more than one invasive species and we made a rule in our inclusion criteria that if more than one invasive species impact was studied in an article, we wound treat these as separate rows in our data extraction spreadsheet.

2. In the **middle column**, you will see many dots of varying size that each have a 95% confidence interval around the dots. Each dot and CI represents the calculated effect size (in our case Response Ratio) for a pair of richness measurements extracted from an article. The range of  effect size magnitudes are listed along the x-axis of the forest plot. You see in this plot that effect sizes range from approximately -4.0 to +4.0. The confidence interval is based on the standard deviation and sample size extracted from each article. The size of each dot in the forest plot corresponds to the sample size associated with effect size calculations.  This means that studies based on large sample sizes (e.g., *Flinn et al. 2014*) has it's effect size represented with a large black dot and studies with smaller sample sizes (e.g., *Nascimbene et al. 2012*) have effect sizes indicated with small black dots. In general, as sample sizes increase, you will see a shrinking of the confidence intervals around the effect size. Importantly, confidence intervals that overlap the zero line in a forest plot, indicated here with a dashed vertical line indicate a non-significant effect size.  This concept of non-significant effect sizes overlapping zero applies to the article-level effect sizes represented in each row in the forest plot as well as the summary effect size displayed in the very last row of the plot.  The last row of the forest plot shows the summary effect size that was calculated by the fixed effects model we ran in this chapter. When the meta-analysis summarizes a large number of articles, the effect size typically has a  narrow confidence interval that may be difficult to visualize in the plot. This is why we annotate the plot in the right column.

3. In the **right column** of the forest plot, we have a list of each effect size and confidence interval calculated in chapter 2 of this book using the `escalc` function. The first number listed in the right columns the effect size, and the two numbers in the brackets are lower and upper confidence interval values.  Again, if the confidence interval does not include zero, this indicates that either the individual effect size for a single row, or the summary effect for the entire meta-analysis indicates a significant difference from zero.  If displaying each unique effect size and CI adds too much clutter to your plot they can be easily removed by changing the code above from `annotate = TRUE` to `annotate = FALSE` in the forest plot code above. We suggest that you always display the effect size and CI for the summary effect size located at the bottom of your forest plot.  This is done using a separate `addpoly` function in `metafor`.  As part of the `addpoly` function you can specify exactly how the summary effect appears both in the figure (e.g., with a colorful box or diamond) and in text (e.g., bold, italics, red font, larger text).