Answers — Hwk2A

Homework 2A: Descriptive statistics

Homework 2A: Answers to odd-numbered problems

Update 2025-09-08: added “dispersion problems.”

Note: Plain R code, suitable for Google Colab or any other R environment.

Central tendency

1. R code

# Question 1
?median

R response

Opens the HTML help page for median() in the default browser.

3. R code

# question 3. Confirm your hand calculations from #2.
y <- c(1,1,3,6)
mean(y)
median(y)
# Mode: base R has no built-in function for the statistical mode,
# so tabulate the values and keep the most frequent one(s).
temp <- table(y)
names(temp)[temp == max(temp)]

R response

2.75
2
'1'
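
The table() idiom above can be wrapped in a small reusable function. This is a minimal sketch, not part of the assigned answer; the name get_mode is a hypothetical helper chosen for illustration. It returns every value tied for the highest count, so a tie gives more than one mode.

```r
# Hypothetical helper (name chosen for illustration only):
# returns every value tied for the highest frequency.
get_mode <- function(v) {
  counts <- table(v)                              # frequency of each distinct value
  modes <- names(counts)[counts == max(counts)]   # names of the most frequent values
  # table() stores the values as character names; convert back for numeric input
  if (is.numeric(v)) as.numeric(modes) else modes
}

get_mode(c(1, 1, 3, 6))   # the data from question 3
get_mode(c(2, 2, 5, 5))   # a tie returns both values
```

Returning a numeric vector rather than a character string makes the result easier to use in later calculations.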

5. R code

# Question 5. Least frequent values, reusing temp from question 3.
names(temp)[temp == min(temp)]

R response

'3' '6'

7a.
R code

# 7a
x <- c(30.5,48.8, 37.4,56.6,31, 50, 61.2, 74, 63.4, 47.6)
mean(x); median(x)

R response

50.05
49.4

Optional: try writing a function. Hint: I asked an AI assistant for help, then modified the reply.

report_stats <- function(x) {
  # Calculate statistics, removing NA values
  stats <- c(
    mean = mean(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    sd = sd(x, na.rm = TRUE)
  )
  return(stats)
}

report_stats(x)

R response

mean median sd
50.0500 49.4000 14.2663
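
The na.rm = TRUE arguments only matter when data are missing. As a rough illustration, the sketch below appends one NA to the 7a data; the vector x_missing is a hypothetical example, not part of the homework data.

```r
# Same function as above, repeated so this block runs on its own
report_stats <- function(x) {
  c(mean = mean(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    sd = sd(x, na.rm = TRUE))
}

x <- c(30.5, 48.8, 37.4, 56.6, 31, 50, 61.2, 74, 63.4, 47.6)
x_missing <- c(x, NA)      # hypothetical: the 7a data plus one missing value

mean(x_missing)            # NA, because na.rm defaults to FALSE
report_stats(x_missing)    # same values as report_stats(x)
```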

7b.
R code

# moms
mom <- c(67,66.5,64,58.5,68,66.5)
# dads
dad <- c(78.5,75.5,75,75,74,74)
height <- data.frame(mom, dad); height # unstacked
long.height <- stack(height, select = c("mom", "dad")); long.height
names(long.height)[names(long.height) == "values"] <- "height"
names(long.height)[names(long.height) == "ind"] <- "parent"
long.height # stacked

R response

# unstacked
A data.frame: 6 × 2
mom dad
<dbl> <dbl>
67.0 78.5
66.5 75.5
64.0 75.0
58.5 75.0
68.0 74.0
66.5 74.0

# stacked
A data.frame: 12 × 2
height parent
<dbl> <fct>
67.0 mom
66.5 mom
64.0 mom
58.5 mom
68.0 mom
66.5 mom
78.5 dad
75.5 dad
75.0 dad
75.0 dad
74.0 dad
74.0 dad

R code

# Repeat the code from 7a, or improve on it with the aggregate() function,
# which returns the mean and median for each parent in one call.
aggregate(height ~ parent, data = long.height, function(x) c(mean = mean(x), median = median(x)))

R response

A data.frame: 2 × 2
parent height
<fct> <dbl[,2]>
mom 65.08333, 66.5
dad 75.33333, 75.0
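
If stacking feels like extra work, the unstacked data frame can be summarized directly, one column at a time. This is a sketch of one alternative using sapply(), not the required approach; wide_summary is a name introduced here for illustration.

```r
mom <- c(67, 66.5, 64, 58.5, 68, 66.5)
dad <- c(78.5, 75.5, 75, 75, 74, 74)
height <- data.frame(mom, dad)   # unstacked (wide) format

# Apply the same two statistics to each column of the data frame.
# The result is a matrix with one row per statistic and one column per parent.
wide_summary <- sapply(height, function(col) c(mean = mean(col), median = median(col)))
wide_summary
```

The stacked format is still worth learning, because aggregate() and most modeling functions in R expect one observation per row.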

7c.
R code

co2 <- c(316.19, 319.42, 325.13, 330.62, 338.29, 346.12, 354.41, 360.82, 396.83, 380.31, 389.99, 402.06, 414.26)
mean(co2); median(co2)

R response

359.573076923077
354.41

7d.
R code

bufo <- c(71.3, 71.4, 74.1, 85.4, 85.4, 86.6, 97.4, 99.6, 107, 115.7, 135.7, 156.2)
# Set reasonable significant figures for mean.
# paste() already inserts a space between its arguments.
print(paste("mean:", signif(mean(bufo), digits=3)))
print(paste("median:", median(bufo)))

R response

[1] "mean: 98.8"
[1] "median: 92"

Dispersion

1. Do this by hand, but here’s what the R code looks like.
R code

y = c(1,1,3,6)
range(y)
sd(y)
var(y)

R output

1 6 # therefore the range is 5
2.362908
5.583333
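
To check the hand calculation step by step, the same arithmetic can be written out in R. This is a sketch for checking your work; the intermediate names (y_bar, ss, var_hand, and so on) are introduced here for illustration.

```r
y <- c(1, 1, 3, 6)
n <- length(y)

y_bar <- sum(y) / n             # mean: 11 / 4 = 2.75
dev_sq <- (y - y_bar)^2         # squared deviations from the mean
ss <- sum(dev_sq)               # sum of squares: 16.75
var_hand <- ss / (n - 1)        # sample variance divides by n - 1, not n
sd_hand <- sqrt(var_hand)       # standard deviation is the square root of the variance
range_hand <- max(y) - min(y)   # 6 - 1 = 5

# Compare against the built-in functions
all.equal(var_hand, var(y))
all.equal(sd_hand, sd(y))
```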

3.
R code

x <- c(4,4,4,4,5,6,6,6,7,7,8,8,8,8,8)
IQR(x)
sd(x)
sd(x)/sqrt(length(x))

R output

3.5
1.656157
0.427618
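
The three lines of output are the interquartile range, the sample standard deviation, and the standard error of the mean. A short sketch of where the first and last come from follows; sem is a hypothetical helper name, and the quartiles use R's default quantile method (type = 7), which can differ slightly from some textbook hand methods.

```r
x <- c(4,4,4,4,5,6,6,6,7,7,8,8,8,8,8)

# IQR is the distance between the third and first quartiles
q <- quantile(x, probs = c(0.25, 0.75))   # default type = 7
iqr_hand <- unname(q[2] - q[1])           # 8 - 4.5 = 3.5

# Standard error of the mean: sd divided by the square root of the sample size
sem <- function(v) sd(v) / sqrt(length(v))
sem(x)
```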

5.
R code

# Code comment -- appending "; x4" to the end of the line makes R
# print the object x4; otherwise you would need print(x4) to see it.
# sample() draws at random, so your values will differ from those shown.
x4 <- sample(x,4); x4
x8 <- sample(x,8); x8 
x12 <- sample(x,12); x12

R output

[1] 4 4 6 4
[1] 8 8 5 8 4 4 6 8
[1] 6 4 7 8 4 6 8 4 4 7 8 6
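
Because sample() is random, rerunning the code gives a different subsample each time. If you want a result you can reproduce, one option is to set the random seed first. The sketch below uses an arbitrary seed value (123) chosen only for illustration.

```r
x <- c(4,4,4,4,5,6,6,6,7,7,8,8,8,8,8)

set.seed(123)            # arbitrary seed, chosen only for illustration
first <- sample(x, 4)

set.seed(123)            # resetting the same seed repeats the same draw
second <- sample(x, 4)

identical(first, second) # the two draws match
```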

7. range, IQR, sample standard deviation, and coefficient of variation

# question07a
R code

question7a <- c(30.5,48.8, 37.4,56.6,31, 50, 61.2, 74, 63.4, 47.6)
# Nothing wrong with reporting one statistic at a time. We can do better
# with a function, modified from the 7a option (above) in Central tendency.
report_dev <- function(x) {
  stats <- c(
    range = range(x),
    IQR = IQR(x),
    sd = sd(x),
    # coefficient of variation as a percentage: sd relative to the mean
    # (sd(x)/sqrt(length(x)) would be the standard error of the mean instead)
    cv = 100 * sd(x)/mean(x)
  )
  # round to three decimal places for reporting
  return(round(stats, 3))
}

report_dev(question7a)

R output

range1 range2    IQR     sd     cv 
30.500 74.000 20.100 14.266 28.504 

# question07b
# We created a long data frame called long.height in 7b (above) in Central tendency.
R code

aggregate(height ~ parent, data = long.height, 
function(x) round(c(range = range(x), IQR = IQR(x), sd = sd(x), cv = 100 * sd(x)/mean(x)), 3))

R output

  parent height.range1 height.range2 height.IQR height.sd height.cv
1    mom        58.500        68.000      2.250     3.484     5.354
2    dad        74.000        78.500      1.125     1.663     2.208

# question07c
# Note: we already created a function, report_dev() (see the response to 7a), for these calculations.
R code

report_dev(co2)

R output

 range1  range2     IQR      sd      cv 
316.190 414.260  59.370  33.813   9.404 

# question07d
# Same note as above: use our new function.
R code

report_dev(bufo)

R output

 range1  range2     IQR      sd      cv 
 71.300 156.200  26.600  26.349  26.664 

 
