darn
Merge branch 'master' of https://github.com/emilioalaca/PLS120Book2018

# Conflicts:
#	12.0_TrtStructure.Rmd
#	ch2pops.html
#	chTrtstr.html
emilioalaca committed Nov 15, 2018
2 parents 63a95df + 331a8e4 commit 17871a7
Showing 25 changed files with 2,180 additions and 9,702 deletions.
Binary file modified .DS_Store
Binary file not shown.
664 changes: 509 additions & 155 deletions .Rhistory (1)

Large diffs are not rendered by default.

89 changes: 46 additions & 43 deletions 05.0_Probability.Rmd

Large diffs are not rendered by default.

2 changes: 2 additions & 0 deletions 06.0_SamplingEstimators.Rmd
@@ -60,6 +60,7 @@ In reality, a random variable is a real-valued function defined over the sample
```

Do not confuse this function property with the possibility of having random variables grouped into vectors, where each vector contains more than one value, each value being the realization of a different random variable. For example, in the roll of two die we can define the random vector {number of dots in the first die, number of dots in the second die}. The importance of thinking of a random variable as a function is that it gives a lot of flexibility to deal with the results of random experiments. Instead of just counting dots on the face of a die we can use complicated functions that allow us to test hypotheses about more complex processes, such as administering medicines to animals, or comparing the yield of multiple varieties of food plants.
Do not confuse this function property with the possibility of having random variables grouped into vectors, where each vector contains more than one value, each value being the realization of a different random variable. For example, in the roll of two dice we can define the random vector {number of dots in the first die, number of dots in the second die}. The importance of thinking of a random variable as a function is that it gives a lot of flexibility to deal with the results of random experiments. Instead of just counting dots on the face of a die we can use complicated functions that allow us to test hypotheses about more complex processes, such as administering medicines to animals, or comparing the yield of multiple varieties of food plants.
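
The distinction between a random vector and a random variable defined as a function of the outcome can be illustrated with a minimal R sketch. The names `omega` and `total_dots` are hypothetical, chosen only for this illustration, and the dice are assumed to be fair.

```r
# Sample space for rolling two dice: all 36 ordered outcomes.
# Each row is one realization of the random vector {die1, die2}.
omega <- expand.grid(die1 = 1:6, die2 = 1:6)

# A random variable is a function of the outcome; here, the total number of dots.
total_dots <- function(outcome) outcome$die1 + outcome$die2

# Distribution of the random variable, assuming all 36 outcomes are equally likely.
dist_total <- table(total_dots(omega)) / nrow(omega)

# One simulated roll: the random vector and the value of the derived variable.
set.seed(1)
roll <- omega[sample(nrow(omega), 1), ]
roll
total_dots(roll)
```

Any other function of the outcome, such as the maximum or the difference of the two dice, would define a different random variable over the same sample space.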

A more formal definition and an intuitive physical model of random variable is given in [@286171]. Make sure to also explore the explanations given by W. Huber in [@54894].
@@ -1082,6 +1083,7 @@ par(mfrow = c(1,1))
<br>


The areas under the curve are used to represent proportions of the population with values between the specified extremes. Imagine that the length of avocados received for sorting and packing at a plant has a normal distribution with mean 9 cm and variance 1.21.cm^2. The following are questions that we can answer using the normal distribution (data and situations described are fictitious).
The areas under the curve represent the proportions of the population with values between the specified extremes. Imagine that the length of avocados received for sorting and packing at a plant has a normal distribution with mean 9 cm and variance 1.21 cm^2^. The following are questions that we can answer using the normal distribution (data and situations described are fictitious).

(@) A buyer wants avocados that are longer than 8 cm. What proportion of the avocados received could go to that buyer?
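
A question like this one reduces to an area under the normal curve, which can be sketched in R with `pnorm`. The variable names below are hypothetical. The standard deviation is the square root of the stated variance, so it is 1.1 cm.

```r
# Parameters stated in the text: mean 9 cm, variance 1.21 cm^2.
mu <- 9
sigma <- sqrt(1.21)

# Proportion of avocados longer than 8 cm: the area to the right of 8.
p_longer <- pnorm(8, mean = mu, sd = sigma, lower.tail = FALSE)
p_longer
```

The result is roughly 0.82, so a little over four fifths of the avocados received could go to that buyer.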
6 changes: 5 additions & 1 deletion 07.0_ConfIntHoTesting.Rmd
@@ -206,6 +206,7 @@ The **sampling distribution** of a statistics is the distribution of the values

When the data are normally distributed, or approximately normal given a large sample size (e.g., n >30), tests about population means with **known variance** use the z-score as the statistic. The **z-statistic** is calculated as follows:

<br>
A hypothesis is stated about a population, usually involving a parameter. For example, the mean photosynthetic rate for the non-native eelgrass *Zostera japonica* found on the south side quadrant of Padilla Bay, WA is the same as the mean photosynthetic rate for *Z. japonica* eelgrass found in all quadrants in Padilla Bay.

$$z_{calc} = \frac{\bar{Y} - \mu_0}{\sigma / \sqrt{r}}$$
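
As a minimal sketch of the formula above, the z-statistic can be computed directly in R. The sample `y`, the hypothesized mean `mu0` and the known standard deviation `sigma` are made-up values used only for illustration. The two-sided p-value is an added step that assumes a two-sided alternative.

```r
# Hypothetical sample of r observations (illustrative values only).
y <- c(10.8, 9.6, 11.2, 10.1, 10.9, 9.9, 11.4, 10.5)

mu0 <- 10      # hypothesized population mean (assumed)
sigma <- 0.8   # known population standard deviation (assumed)
r <- length(y) # sample size

# z-statistic: distance of the sample mean from mu0 in standard-error units.
z_calc <- (mean(y) - mu0) / (sigma / sqrt(r))

# Two-sided p-value under the standard normal distribution.
p_value <- 2 * pnorm(abs(z_calc), lower.tail = FALSE)

z_calc
p_value
```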
@@ -458,25 +459,27 @@ text(x = 20, y = 1.05 * max(ci.hi), labels = coverage, cex = 1.2)
## Confidence Intervals for the Variance

In the section about [$\chi^2$ distribution](#chisqDist) we noted that the $\chi^2$ distribution is the distribution of a random variable created by summing the squares of multiple standard normal random variables. A little work on the formula for the sample variance shows that the estimated variance is a random variable with a distribution related to the $\chi^2$.
>>>>>>> EAL11Oct18

<br>
````{block,, type='stattip'}
- Never accept the null hypothesis (unless your power is very large, but that is another story).
````
<br>

\begin{align}
$$\begin{align}
&\hat{\sigma}^2 = S^2 = \sum_{i = 1}^r \ \frac{(Y_i - \bar{Y})^2}{r - 1} \implies S^2 \ (r - 1) = \sum_{i = 1}^r \ (Y_i - \bar{Y})^2 \\[25pt]
&\text{Because } Y_i \sim N(\mu, \sigma^2),\qquad (Y_i - \bar{Y}) \sim N(0, \sigma^2) \ \text{ and} \\[25pt]
&\frac{(Y_i - \bar{Y})}{\sigma} \sim N(0, 1) \ \text{ is a standard normal random variable, and} \\[25pt]
&\frac{S^2 \ (r - 1)}{\sigma^2} = \sum_{i = 1}^r \ \frac{(Y_i - \bar{Y})^2}{\sigma^2} \sim \ \chi^2_{r-1}
\end{align}
\end{align}$$

<br>

Based on this result, we use the $\chi^2_{r-1}$ distribution to construct confidence intervals for the true variance based on the sample variance. By definition

<br>
<!-- Figure \@ref(fig:prob_dist) is a visual representation of test-statistic values under a normal distribution. -->

$$P \left (\chi^2_{r-1, \ \alpha/2} \ \lt \frac{S^2 \ (r - 1)}{\sigma^2} \lt \chi^2_{r-1, \ 1 - \alpha/2} \right ) = 1 - \alpha$$
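
The probability statement above can be rearranged to bound $\sigma^2$, giving the interval from $S^2 (r - 1) / \chi^2_{r-1, \ 1 - \alpha/2}$ to $S^2 (r - 1) / \chi^2_{r-1, \ \alpha/2}$. The following R sketch computes it for a simulated sample. The sample size, the simulation parameters and the variable names are assumptions made for illustration, with $\alpha = 0.05$.

```r
# Simulated normal sample (hypothetical parameters, for illustration only).
set.seed(42)
r <- 20
y <- rnorm(r, mean = 9, sd = 1.1)

alpha <- 0.05
s2 <- var(y)  # sample variance S^2, computed with r - 1 in the denominator

# Lower and upper quantiles of the chi-square distribution with r - 1 df.
chi_lo <- qchisq(alpha / 2, df = r - 1)
chi_hi <- qchisq(1 - alpha / 2, df = r - 1)

# Dividing by the larger quantile gives the lower limit, and vice versa.
ci_var <- c(lower = s2 * (r - 1) / chi_hi,
            upper = s2 * (r - 1) / chi_lo)
ci_var
```

The interval is not symmetric around $S^2$ because the $\chi^2$ distribution is skewed to the right.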
@@ -549,6 +552,7 @@ text(x = 20, y = 1.05 * max(ci.hi), labels = coverage, cex = 1.2)

We use the first sample obtained above to illustrate the calculation for a CI.

<br>
### Correct and incorrect interpretation of the Confidence Interval

