Answers — Hwk2A
Homework 2A: Descriptive statistics
Homework 2A: Answers to odd-numbered problems
Update 2025-09-08: added “dispersion problems.”
Note: Plain R code, suitable for Colab or any other R environment.
Central tendency
1. R code
# Question 1
?median
R response
Opens the HTML help page for median() in the default browser.
3. R code
# Question 3. Confirm your hand calculations from #2.
y <- c(1,1,3,6)
mean(y)
median(y)
# Function for mode
temp = table(as.vector(y))
names(temp)[temp==max(temp)]
R response
2.75
2
'1'
5. R code
# Question 5.
names(temp)[temp==min(temp)]
R response
'3' '6'
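The table() idiom used in questions 3 and 5 can be wrapped in a small helper so it is reusable. This is only a sketch: `sample_mode` is a hypothetical name (base R's mode() reports an object's storage type, not the statistical mode), and the conversion back to numbers assumes numeric input.

```r
# Hypothetical helper, not a base R function.
# Returns every value tied for the highest frequency.
sample_mode <- function(v) {
  counts <- table(v)                            # frequency of each distinct value
  top <- names(counts)[counts == max(counts)]   # names of the most frequent value(s)
  as.numeric(top)                               # names are character; assumes numeric data
}

y <- c(1, 1, 3, 6)
sample_mode(y)                  # 1
sample_mode(c(2, 2, 5, 5, 9))   # 2 5 (a tie returns both values)
```

Swapping max() for min() inside the helper would return the least frequent values instead, as in question 5.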
7a.
R code
# 7a
x <- c(30.5, 48.8, 37.4, 56.6, 31, 50, 61.2, 74, 63.4, 47.6)
mean(x); median(x)
R response
50.05
49.4
Optional: try writing a function. Hint: I asked an AI assistant for help, then modified its reply.
report_stats <- function(x) {
  # Calculate statistics, removing NA values
  stats <- c(
    mean = mean(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    sd = sd(x, na.rm = TRUE)
  )
  return(stats)
}
report_stats(x)
R response
   mean  median      sd
50.0500 49.4000 14.2663
7b.
R code
# moms
mom <- c(67,66.5,64,58.5,68,66.5)
# dads
dad <- c(78.5,75.5,75,75,74,74)
height <- data.frame(mom, dad); height # unstacked
long.height <- stack(height, select = c("mom", "dad")); long.height
names(long.height)[names(long.height) == "values"] <- "height"
names(long.height)[names(long.height) == "ind"] <- "parent"
long.height # stacked
R response
# unstacked
A data.frame: 6 × 2
  mom   dad
<dbl> <dbl>
 67.0  78.5
 66.5  75.5
 64.0  75.0
 58.5  75.0
 68.0  74.0
 66.5  74.0
# stacked
A data.frame: 12 × 2
height parent
 <dbl>  <fct>
  67.0    mom
  66.5    mom
  64.0    mom
  58.5    mom
  68.0    mom
  66.5    mom
  78.5    dad
  75.5    dad
  75.0    dad
  75.0    dad
  74.0    dad
  74.0    dad
# repeat code from 7a, or improve by use of aggregate() command.
# get mean and median at same time
aggregate(height ~ parent, data = long.height,
          function(x) c(mean = mean(x), median = median(x)))
R response
A data.frame: 2 × 2
parent  height
<fct>   <dbl[,2]>
mom     65.08333, 66.5
dad     75.33333, 75.0
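If the data are still in the wide (unstacked) layout, a similar summary can be had without stacking first. A minimal sketch, assuming the same mom and dad vectors as in 7b; `wide_summary` is just an illustrative name.

```r
mom <- c(67, 66.5, 64, 58.5, 68, 66.5)
dad <- c(78.5, 75.5, 75, 75, 74, 74)
height <- data.frame(mom, dad)

# sapply() visits each column of the data frame in turn and
# binds the named results into a matrix (rows: statistic, columns: parent).
wide_summary <- sapply(height, function(col) c(mean = mean(col), median = median(col)))
wide_summary
```

The stacked layout with aggregate() remains the better fit when group labels live in their own column, which is the usual arrangement for later analyses.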
7c.
R code
co2 <- c(316.19, 319.42, 325.13, 330.62, 338.29, 346.12, 354.41, 360.82, 396.83, 380.31, 389.99, 402.06, 414.26)
mean(co2); median(co2)
R response
359.573076923077
354.41
7d.
R code
bufo <- c(71.3, 71.4, 74.1, 85.4, 85.4, 86.6, 97.4, 99.6, 107, 115.7, 135.7, 156.2)
# Set reasonable significant figures for mean.
print(paste("mean:", signif(mean(bufo), digits=3)))
print(paste("median:", median(bufo)))
R response
[1] "mean: 98.8"
[1] "median: 92"
Dispersion
1. Do by hand, but here’s what the R code looks like
R code
y = c(1,1,3,6)
range(y)
sd(y)
var(y)
R output
1 6 # therefore the range is 5
2.362908
5.583333
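Since this problem is meant to be done by hand, it may help to see the hand calculation spelled out step by step in R. A minimal sketch; the intermediate names (dev, ss, s2, s) are only illustrative.

```r
y <- c(1, 1, 3, 6)
n <- length(y)

dev <- y - mean(y)     # deviations from the mean: -1.75 -1.75 0.25 3.25
ss  <- sum(dev^2)      # sum of squared deviations: 16.75
s2  <- ss / (n - 1)    # sample variance: 16.75 / 3 = 5.583333
s   <- sqrt(s2)        # sample standard deviation: 2.362908

max(y) - min(y)        # range as a single number: 5
```

Dividing by n - 1 rather than n is what makes this the sample variance, and it is what var() and sd() do.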
3.
R code
x <- c(4,4,4,4,5,6,6,6,7,7,8,8,8,8,8)
IQR(x)
sd(x)
sd(x)/sqrt(length(x))
R output
3.5
1.656157
0.427618
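To see where the IQR of 3.5 comes from, the quartiles can be pulled out explicitly. This sketch relies on quantile() with its default method (type = 7); other quantile types can give slightly different quartiles for small samples. The name `iqr_by_hand` is illustrative.

```r
x <- c(4,4,4,4,5,6,6,6,7,7,8,8,8,8,8)

q <- quantile(x, probs = c(0.25, 0.75))   # first and third quartiles: 4.5 and 8
q
iqr_by_hand <- unname(q[2] - q[1])        # 8 - 4.5 = 3.5, same as IQR(x)
iqr_by_hand

# The third line of the answer above is the standard error of the mean.
sem <- sd(x) / sqrt(length(x))
sem
```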
5.
R code
# Code comment: by appending ";x4" at the end of the line, R
# prints the object x4. Otherwise, have to print(x4) to see x4 output.
x4 <- sample(x,4); x4
x8 <- sample(x,8); x8
x12 <- sample(x,12); x12
R output (sample() draws at random, so your values will differ)
[1] 4 4 6 4
[1] 8 8 5 8 4 4 6 8
[1] 6 4 7 8 4 6 8 4 4 7 8 6
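Because sample() is random, rerunning the code gives different subsets each time. If a repeatable draw is wanted, for example to compare answers with a classmate, one option is to set the random seed first. A minimal sketch; the seed value 2025 is arbitrary and chosen only for illustration.

```r
x <- c(4,4,4,4,5,6,6,6,7,7,8,8,8,8,8)

set.seed(2025)                 # arbitrary seed; any integer works
first_draw <- sample(x, 4)

set.seed(2025)                 # resetting the same seed replays the same draw
second_draw <- sample(x, 4)

identical(first_draw, second_draw)   # TRUE
```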
7. range, IQR, sample standard deviation, and coefficient of variation
question7a <- c(30.5,48.8, 37.4,56.6,31, 50, 61.2, 74, 63.4, 47.6)
# Nothing wrong with just reporting one at a time. We can do better,
# per option modified from 7a (above) in Central Tendency response.
report_dev <- function(x) {
  stats <- c(
    range = range(x),
    IQR = IQR(x),
    sd = sd(x),
    # coefficient of variation: sd relative to the mean
    # (sd(x)/sqrt(length(x)) would be the standard error of the mean, not the CV)
    cv = sd(x)/mean(x)
  )
  return(stats)
}
report_dev(question7a)
# R output
    range1     range2        IQR         sd         cv
30.5000000 74.0000000 20.1000000 14.2662967  0.2850409
# question07b
# We created a long data frame called long.height in 7b (above) in Central Tendency
aggregate(height ~ parent, data = long.height,
          function(x) c(range = range(x), IQR = IQR(x), sd = sd(x), cv = sd(x)/mean(x)))
R output
  parent height.range1 height.range2  height.IQR   height.sd   height.cv
1    mom   58.50000000   68.00000000  2.25000000  3.48448944  0.05353889
2    dad   74.00000000   78.50000000  1.12500000  1.66332999  0.02207960
# question07c
# Note: we already created a function (see the response to 7a) to get these calculations.
R code
report_dev(co2)
R output
      range1       range2          IQR           sd           cv
316.19000000 414.26000000  59.37000000  33.81348311   0.09403786
# question07d
# Same note as above: use our new function.
R code
report_dev(bufo)
R output
     range1      range2         IQR          sd          cv
 71.3000000 156.2000000  26.6000000  26.3488428   0.2666437
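The coefficient of variation (sd divided by the mean) and the standard error of the mean (sd divided by the square root of n) are easy to mix up because both rescale the standard deviation. A minimal sketch computing both for the 7a data; the helper names `cv_fun` and `sem_fun` are hypothetical, not base R functions.

```r
x <- c(30.5, 48.8, 37.4, 56.6, 31, 50, 61.2, 74, 63.4, 47.6)

# Hypothetical helpers for illustration only.
cv_fun  <- function(v) sd(v) / mean(v)            # unitless; multiply by 100 for a percent
sem_fun <- function(v) sd(v) / sqrt(length(v))    # same units as the data

cv_fun(x)    # about 0.285
sem_fun(x)   # about 4.51
```

The CV describes spread relative to the size of the measurements, so it is comparable across variables with different units; the SEM describes the precision of the sample mean and shrinks as n grows.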