Homework 9: Common nonparametric tests
Objective:
- Learn how to use R and Rcmdr to apply common nonparametric alternative tests
- Compare output from parametric tests and their nonparametric alternatives
- Weigh and evaluate output from parametric and nonparametric tests against assumptions
Homework 9 expectations
Read through the entire homework before starting to answer a question. You are expected to have read the chapter and to have completed preceding homework. Answers are provided to odd numbered problems — turn in your work for even numbered problems.
How to work this homework
You may work together, but each of you must turn in your own report. Don’t “plagiarize” from each other. Do include in your report who you worked with.
What to turn in: A PDF file containing relevant R code, statistical results edited to support your answers to the questions, and your answers to the questions (even numbered only). Use of RMarkdown is recommended because it is a simple way to include the graphs you generate; however, copy/paste into a Word document is also acceptable.
Notes. By relevant we mean provide just the R code and results from R functions necessary to support your answers to the questions. For example, do not include
- the entire data set when head(dataset) will do
- screenshots of R output!! R output is text — copy/paste
- all statistical output from an R function.
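As a rough illustration of what "relevant" output can look like, the sketch below uses R's built-in sleep data set (an assumption chosen only for demonstration; it is not part of this homework) to show the first few rows of a data frame and to pull out individual pieces of a test result instead of printing everything. The object name res is an arbitrary choice.

```r
# Illustration only: 'sleep' ships with base R and is not a homework data set.
# Show the first few rows instead of the entire data set.
head(sleep, n = 3)

# Store the test result so individual pieces can be reported.
res <- t.test(extra ~ group, data = sleep)

# Report only what supports the answer: test statistic, df, and p-value.
res$statistic
res$parameter
res$p.value
```

Copy the printed text into your report; there is no need to include the full printout of the test object.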
See Part09: Making a report for an example homework file.
Submit your work to CANVAS. Obey proper file naming formats.
Resources for this homework
Chapter 15. Mike’s Biostatistics Book
Mike’s Workbook for Biostatistics: A quick look at R and R Commander, Part01 – Part10 and previous homework pages presented in this workbook.
Additional R commands and/or code are provided below.
Questions
You’ll need to load the ccc data set into R/Rcmdr. Data sets are published at the end of this page.
- Select all that apply. Assumptions of parametric tests include
A) Data come from a normally distributed population.
B) Equal variances among groups.
C) Independence of errors.
D) Sample size equal among groups.
E) Subjects may be present in more than one treatment group.
- If parametric tests are used, but one or more assumptions are violated, what are the implications?
- Select all that apply. If one or more parametric test assumptions are violated, what options are available to the analyst?
A) Data transform
B) Distribution-free methods like bootstrap
C) Evaluate the model fit
D) Nonparametric alternative
E) Proceed anyhow because parametric tests are generally robust to minor assumption violations
F) Rank the outcome variable and run the parametric ANOVA
- True/False. Nonparametric tests make no assumption about the data. Explain your choice.
- One could take the position that only nonparametric alternative tests should be employed in place of parametric tests, in part because they make fewer assumptions about the data. Why is this position unwarranted?
- For a data set suitable for a one-way ANOVA, what alternatives are available for analysis if the assumption of equal variance among groups is suspect?
A) ANOVA by ranks
B) bootstrap ANOVA F test statistic
C) Kruskal-Wallis test
D) Proceed anyhow because ANOVA is generally robust to unequal variances
- True/False. Resampling tests like the bootstrap are types of nonparametric tests.
- This question lists all fourteen statistical tests we have been introduced to so far.
a. Mark yes or no as to whether or not the test is a parametric test.
b. Match the nonparametric test(s) with their equivalent parametric test(s). If there is no equivalent, simply write “none.”
Test | Parametric test? Yes/No | If nonparametric, write the number(s) of the tests that the nonparametric test serves as an alternate for
1. ANOVA by ranks
2. Bartlett Test
3. Chi-squared contingency table
4. Chi-squared goodness of fit
5. Fisher Exact test
6. Independent sample T-test
7. Kruskal-Wallis test
8. Levene Test
9. One-sample T-test
10. One-way ANOVA
11. Paired T-test
12. Shapiro-Wilk test
13. Tukey posthoc comparisons
14. Welch’s test
- Conduct an independent t-test and a separate Wilcoxon test on the Lizard body mass data (data set in Ch15.2 and repeated below).
Make a box plot to display the two groups and describe the middle and variability.
Compare results of the tests of hypothesis. Do the t-test results agree with the Wilcoxon test? If not, list possible reasons why the two tests disagree.
- Using the Arabidopsis thaliana plants grown in common garden dataset below, test the null hypothesis using one-way ANOVA, one-way ANOVA on ranks, and the nonparametric Kruskal-Wallis test.
Make a box plot to display the three groups and describe the middle and variability.
Compare results of test of hypothesis. Do they agree? If not, list possible reasons why the tests disagree.
Datasets
By now you should be comfortable getting data like this into R. We want a stacked worksheet. For review, see Part 07. Working with your own data.
Dataset: Lizard body mass
Geckos: 3.186, 2.427, 4.031, 1.995
Anoles: 5.515, 5.659, 6.739, 3.184
presented in Ch15.2.
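One possible way to enter the lizard data as a stacked worksheet and call the relevant base R functions is sketched here. The names lizards, species, and mass are illustrative choices, not names required by the homework; interpretation of the output is left to you.

```r
# Stacked (long) worksheet: one column for the group, one for the measurement.
# 'lizards', 'species', and 'mass' are illustrative names.
lizards <- data.frame(
  species = factor(rep(c("Gecko", "Anole"), each = 4)),
  mass = c(3.186, 2.427, 4.031, 1.995,   # Geckos
           5.515, 5.659, 6.739, 3.184)   # Anoles
)
head(lizards)

# Box plot of body mass by species
boxplot(mass ~ species, data = lizards, ylab = "Body mass")

# Independent-sample t-test; R's default does not assume equal variances
# (Welch). Add var.equal = TRUE for the pooled-variance version.
t.test(mass ~ species, data = lizards)

# Wilcoxon rank-sum test, the nonparametric alternative
wilcox.test(mass ~ species, data = lizards)
```

In Rcmdr the same analyses are available from the menus once lizards is the active data set.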
Dataset: Leaf lengths of three genetic strains of Arabidopsis thaliana plants grown in common garden
  arabid  leaf
1 wt     4.909
2 wt     5.736
3 wt     5.108
4 AS1    6.956
5 AS1    5.809
6 AS1    6.888
7 AS2    4.768
8 AS2    4.209
9 AS2    4.065
presented in Ch12.2.
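A minimal sketch for the Arabidopsis data follows, assuming the column names arabid and leaf shown above; the data frame name leaves is an arbitrary choice. It builds the stacked worksheet and calls the three analyses named in the question using base R functions.

```r
# Stacked worksheet for the three strains; 'leaves' is an illustrative name.
leaves <- data.frame(
  arabid = factor(rep(c("wt", "AS1", "AS2"), each = 3)),
  leaf = c(4.909, 5.736, 5.108,   # wt
           6.956, 5.809, 6.888,   # AS1
           4.768, 4.209, 4.065)   # AS2
)
head(leaves)

# Box plot of leaf length by strain
boxplot(leaf ~ arabid, data = leaves, ylab = "Leaf length")

# One-way ANOVA on the original values
summary(aov(leaf ~ arabid, data = leaves))

# One-way ANOVA on ranks: rank the outcome, then run the same ANOVA
summary(aov(rank(leaf) ~ arabid, data = leaves))

# Kruskal-Wallis test
kruskal.test(leaf ~ arabid, data = leaves)
```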