Homework 9: Common nonparametric tests

Objective:

Learn how to use R and Rcmdr to apply common nonparametric alternative tests.
Compare output from parametric tests and their nonparametric alternatives.
Weigh and evaluate output from parametric and nonparametric tests against their assumptions.

Homework 9 expectations

Read through the entire homework before starting to answer a question. You are expected to have read the chapter and to have completed the preceding homework. Answers are provided to odd-numbered problems; turn in your work for even-numbered problems.

How to work this homework

You may work together, but each of you must turn in your own report. Don’t “plagiarize” from each other. Do include in your report who you worked with.

What to turn in: A PDF file containing relevant R code, statistical results edited to support your answers to the questions, and your answers to the questions (even-numbered only). Use of RMarkdown is recommended because it is a simple way to include generated graphs; however, copy/paste into a Word document is also acceptable.

Notes. By relevant we mean provide just the R code and results from R functions necessary to support your answers to the questions. For example, do not include

the entire data set when head(dataset) will do
screenshots of R output!! R output is text — copy/paste
all statistical output from an R function.
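
As a minimal sketch of what trimmed output can look like, the R code below previews a data set with head() and then reports only selected pieces of a test result. The built-in sleep data set and the object name res are stand-ins chosen for illustration; substitute your own data and the test that the question calls for.

```r
# 'sleep' is a data set that ships with R; it stands in for your own data here
head(sleep)                      # first six rows, enough to show the layout

# Store the result of a test so that pieces of it can be reported separately
# ('res' is an illustrative name, not one used in the workbook)
res <- t.test(extra ~ group, data = sleep)

res$statistic                    # the t statistic only
res$parameter                    # the degrees of freedom only
res$p.value                      # the p-value only
```

Printing res by itself gives the full output; pulling out single elements like this is one way to keep a report short.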

See Part09: Making a report for an example homework file.

Submit your work to CANVAS. Obey proper file naming formats.

Resources for this homework

Chapter 15. Mike’s Biostatistics Book

Mike’s Workbook for Biostatistics: A quick look at R and R Commander, Part01 – Part10 and previous homework pages presented in this workbook.

Additional R commands and/or code provided below.

Answers to selected problems

Questions

You’ll need to load the ccc data set into R/Rcmdr. Data sets are published at the end of this page.

Select all that apply. Assumptions of parametric tests include
A) Data come from a normally distributed population.
B) Equal variances among groups.
C) Independence of errors.
D) Sample size equal among groups.
E) Subjects may be present in more than one treatment group.
If parametric tests are used, but one or more assumptions are violated, what are the implications?
Select all that apply. If one or more parametric test assumptions are violated, what options are available to the analyst?
A) Data transform
B) Distribution-free methods like bootstrap
C) Evaluate the model fit
D) Nonparametric alternative
E) Proceed anyhow because parametric tests are generally robust to minor assumption violations
F) Rank the outcome variable and run the parametric ANOVA
True/False. Nonparametric tests make no assumption about the data. Explain your choice.
One could take the position that only nonparametric alternative tests should be employed in place of parametric tests, in part because they make fewer assumptions about the data. Why is this position unwarranted?
For a data set suitable for a one-way ANOVA, what alternatives are available for analysis if equal variance among groups assumption is suspect?
A) ANOVA by ranks
B) bootstrap ANOVA F test statistic
C) Kruskal-Wallis test
D) Proceed anyhow because ANOVA is generally robust to unequal variances
True/False. Resampling tests like the bootstrap are types of nonparametric tests.

This question lists all fourteen statistical tests we have been introduced to so far
a. Mark yes or no as to whether or not the test is a parametric test
b. Match each nonparametric test with its equivalent parametric test(s). If there is no equivalent, simply write “none.”

	Parametric test? Yes/No	If nonparametric, write the number(s) of the tests that the nonparametric test serves as an alternate for
1. ANOVA by ranks
2. Bartlett Test
3. Chi-squared contingency table
4. Chi-squared goodness of fit
5. Fisher Exact test
6. Independent sample T-test
7. Kruskal-Wallis test
8. Levene Test
9. One-sample T-test
10. One-way ANOVA
11. Paired T-test
12. Shapiro-Wilk test
13. Tukey posthoc comparisons
14. Welch’s test

Conduct an independent t-test and a separate Wilcoxon test on the Lizard body mass data (data set in Ch15.2 and repeated below).
Make a box plot to display the two groups and describe the middle and variability.
Compare the results of the two tests of hypothesis. Does the t-test agree with the Wilcoxon test? If not, list possible reasons why the two tests disagree.
Using the Arabidopsis thaliana plants grown in common garden dataset below, test the null hypothesis using one-way ANOVA, one-way ANOVA on ranks, and the nonparametric Kruskal-Wallis test.
Make a box plot to display the three groups and describe the middle and variability.
Compare the results of the tests of hypothesis. Do they agree? If not, list possible reasons why the tests disagree.

Datasets

By now you should be comfortable getting data like this into R. We want a stacked worksheet. For review, see Part 07. Working with your own data.

Dataset: Lizard body mass

Geckos: 3.186, 2.427, 4.031, 1.995
Anoles: 5.515, 5.659, 6.739, 3.184

presented in Ch15.2.
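
One possible way to get the lizard data into a stacked worksheet and set up the analysis is sketched below. The object and column names (lizards, species, mass, tt, wt) are illustrative choices and do not come from the workbook; Rcmdr menus can be used to do the same things. The code only sets up the analysis and leaves interpretation of the output to you.

```r
# Lizard body mass values copied from the dataset listing above
geckos <- c(3.186, 2.427, 4.031, 1.995)
anoles <- c(5.515, 5.659, 6.739, 3.184)

# Stacked (long) format: one column for the group label, one for the measurement.
# 'lizards', 'species', and 'mass' are illustrative names.
lizards <- data.frame(
  species = factor(rep(c("Gecko", "Anole"), each = 4)),
  mass    = c(geckos, anoles)
)
head(lizards)

# Box plot of body mass by group
boxplot(mass ~ species, data = lizards, ylab = "Body mass")

# Independent sample t-test; R runs the Welch version by default,
# add var.equal = TRUE for the pooled-variance version
tt <- t.test(mass ~ species, data = lizards)
tt

# Nonparametric alternative: Wilcoxon rank sum test
wt <- wilcox.test(mass ~ species, data = lizards)
wt
```

Storing each result in an object (tt, wt) makes it easy to report only the parts you need, for example tt$p.value.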

Dataset: Leaf lengths of three genetic strains of Arabidopsis thaliana plants grown in common garden

arabid leaf
1 wt 4.909
2 wt 5.736
3 wt 5.108
4 AS1 6.956
5 AS1 5.809
6 AS1 6.888
7 AS2 4.768
8 AS2 4.209
9 AS2 4.065

presented in Ch12.2.
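
The Arabidopsis listing is already stacked, so one option is to paste it directly into read.table(). The sketch below assumes that approach; the object names (arab, fit, fit_rank, kw) are illustrative and not from the workbook. Because the header line has one fewer field than the data lines, read.table() treats the first column as row names.

```r
# Read the stacked listing as printed above; the leading 1-9 become row names
arab <- read.table(header = TRUE, text = "
arabid leaf
1 wt 4.909
2 wt 5.736
3 wt 5.108
4 AS1 6.956
5 AS1 5.809
6 AS1 6.888
7 AS2 4.768
8 AS2 4.209
9 AS2 4.065
")

# Strain is a grouping variable, so store it as a factor
arab$arabid <- factor(arab$arabid)
head(arab)

# Box plot of leaf length by strain
boxplot(leaf ~ arabid, data = arab, ylab = "Leaf length")

# One-way ANOVA on the original values
fit <- aov(leaf ~ arabid, data = arab)
summary(fit)

# One-way ANOVA on ranks: rank the outcome, then run the same ANOVA
fit_rank <- aov(rank(leaf) ~ arabid, data = arab)
summary(fit_rank)

# Kruskal-Wallis test
kw <- kruskal.test(leaf ~ arabid, data = arab)
kw
```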

/MD