Hypothesis testing
Presentation storyline
- Hypothesis
- A statement that makes a claim to explain a phenomenon
- Hypothesis Test
- We test whether two distributions are consistent with each other. One is idealized and represents the distribution assumed under the hypothesis; the other is the empirical distribution of our sample. In other words, we test whether our sample is plausibly drawn from the idealized distribution.
- We decide whether to reject the hypothesis based on a calculated test statistic.
- Example Hypotheses:
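- $H_0: \mu = \mu_0$ and therefore $H_a: \mu \neq \mu_0$ > two-sided test
- $H_0: \mu \le \mu_0$ and therefore $H_a: \mu > \mu_0$ > right-tailed test
- $H_0: \mu \ge \mu_0$ and therefore $H_a: \mu < \mu_0$ > left-tailed test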
- Error types
- Type I error ($\alpha$): false positive; its probability is the significance level
- Type II error ($\beta$): false negative
- Compare to COVID rapid tests:
- The test checks whether the level of a protein in your body is consistent with that of a healthy person; this is the assumption stated by $H_0$. If the test picks up a large amount of the protein, it triggers the coloration of the “second line” and $H_0$ should be rejected. This indicates that the protein level is higher than that of a healthy person, so this is a right-tailed test.
- When designing the test, the sensitivity can be adjusted, which forces a trade-off: with a lower significance level (lower false positive rate), the false negative rate increases.
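The trade-off between the two error rates can be made concrete with a small calculation. The sketch below assumes a right-tailed z-test with known standard deviation; the function name `type_ii_error` and all numbers (effect size, sample size) are made up for illustration and are not taken from these notes.

```python
from statistics import NormalDist


def type_ii_error(alpha, effect, sigma, n):
    """Type II error rate of a right-tailed z-test when the true mean
    lies `effect` above the value claimed by the null hypothesis."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)   # rejection threshold under H0
    shift = effect * n ** 0.5 / sigma        # mean of the z statistic under the alternative
    return std_normal.cdf(z_crit - shift)    # P(fail to reject | alternative is true)


# Hypothetical setting: true shift 0.5, sigma 1.0, sample size 20.
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, effect=0.5, sigma=1.0, n=20)
    print(f"alpha={alpha:.2f}  beta={beta:.3f}")
```

Lowering `alpha` pushes the rejection threshold to the right, so more samples from the alternative fall short of it and `beta` grows.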
- Critical Value approach
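- Check whether the value of the test statistic lies in the rejection region; if it does, reject the null hypothesis.
- P-Value approach
- Compute the probability, assuming the null hypothesis is true, of observing data at least as extreme as the sample data; reject the null hypothesis if this p-value is below the significance level.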
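Both approaches lead to the same decision, which a short sketch can show. It assumes a right-tailed z-test with known standard deviation; the helper `right_tailed_z_test` and the input numbers are hypothetical.

```python
from statistics import NormalDist


def right_tailed_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Run a right-tailed z-test and report the decision both ways."""
    std_normal = NormalDist()
    z = (sample_mean - mu0) / (sigma / n ** 0.5)   # test statistic
    z_crit = std_normal.inv_cdf(1 - alpha)         # start of the rejection region
    p_value = 1 - std_normal.cdf(z)                # P(Z >= z) under H0
    return {
        "z": z,
        "z_crit": z_crit,
        "p_value": p_value,
        "reject_by_critical_value": z > z_crit,
        "reject_by_p_value": p_value < alpha,
    }


# Made-up example: sample mean 103 from n = 50, H0 mean 100, sigma 10.
result = right_tailed_z_test(sample_mean=103.0, mu0=100.0, sigma=10.0, n=50)
print(result)
```

The critical-value approach compares the statistic with a threshold on the z scale, while the p-value approach compares a tail probability with the significance level; they are two views of the same comparison.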
- Procedure (Steps)
- Always perform the 6 steps
- Z-Test ($\sigma$ is known)
- simplest test, can only be applied in rare cases
- T-Test ($\sigma$ is unknown)
- very common to compare sample means
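A one-sample t-test could look like the sketch below. It assumes SciPy is available and uses `scipy.stats.ttest_1samp`, which runs a two-sided test by default; the measurements and the reference mean `mu0` are made up.

```python
from scipy import stats

# Made-up measurements; the population standard deviation is unknown,
# so it is estimated from the sample and the t distribution is used.
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.5, 5.0, 5.7]
mu0 = 5.0      # mean claimed by the null hypothesis (hypothetical)
alpha = 0.05

t_stat, p_value = stats.ttest_1samp(sample, popmean=mu0)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```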
- F-Test
- Fisher test to compare variances/standard deviations
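SciPy has no single call for the two-sample variance F-test, so the sketch below builds it from the F distribution (`scipy.stats.f`). The helper name `f_test_equal_variances` and the data are hypothetical, and the test assumes both samples are normally distributed.

```python
from statistics import variance

from scipy import stats


def f_test_equal_variances(x, y):
    """Two-sided F-test of H0: var(x) == var(y)."""
    f_stat = variance(x) / variance(y)       # ratio of sample variances
    dfn, dfd = len(x) - 1, len(y) - 1        # numerator / denominator degrees of freedom
    lower = stats.f.cdf(f_stat, dfn, dfd)    # P(F <= f_stat) under H0
    upper = stats.f.sf(f_stat, dfn, dfd)     # P(F >= f_stat) under H0
    p_value = min(1.0, 2 * min(lower, upper))
    return f_stat, p_value


# Made-up samples with visibly different spread.
x = [20.1, 19.8, 20.5, 20.0, 19.6, 20.3]
y = [21.0, 18.5, 22.3, 19.0, 20.9, 17.8]
f_stat, p_value = f_test_equal_variances(x, y)
print(f"F = {f_stat:.3f}, p = {p_value:.4f}")
```

The outcome of this test is one way to choose between the pooled and non-pooled two-sample t-tests.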
- Two-sample t-test (pooled, assume equal variances)
- Two-sample t-test (non-pooled, do not assume equal variances)
- Paired two-sample t-test (dependent samples)
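The three two-sample variants map onto SciPy calls roughly as sketched below. The data are made up, and the same numbers are passed to all three calls only to compare them; in practice the study design (independent groups versus repeated measurements on the same subjects) decides which test applies.

```python
from scipy import stats

# Made-up measurements, e.g. a score for 8 subjects before and after a treatment.
before = [72, 75, 68, 80, 77, 74, 69, 73]
after = [70, 72, 67, 76, 75, 71, 68, 70]

pooled = stats.ttest_ind(before, after, equal_var=True)    # pooled, equal variances assumed
welch = stats.ttest_ind(before, after, equal_var=False)    # non-pooled (Welch)
paired = stats.ttest_rel(before, after)                    # paired, dependent samples

for name, res in (("pooled", pooled), ("non-pooled", welch), ("paired", paired)):
    print(f"{name:>10}: t = {res.statistic:.3f}, p = {res.pvalue:.4f}")
```

The paired test works on the per-subject differences, which removes the variation between subjects; with data like this it detects a shift that the independent-sample tests miss.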
- Chi-Square test
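One common variant is the chi-square goodness-of-fit test, sketched below with `scipy.stats.chisquare`. The die-roll counts are made up, and the choice of the goodness-of-fit variant is an assumption, since the notes do not say which chi-square test is meant.

```python
from scipy import stats

# Made-up counts from 120 rolls of a die; H0: the die is fair.
observed = [18, 22, 16, 25, 19, 20]
expected = [20] * 6      # 120 rolls / 6 faces under H0
alpha = 0.05

chi2_stat, p_value = stats.chisquare(f_obs=observed, f_exp=expected)
print(f"chi2 = {chi2_stat:.3f}, p = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```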
See also: Hypothesis test, probabilistic model selection, MOC Projects and Research Threads