Homework 2B: Graphs
This homework is about exploring data and comes in two parts. This part, Homework 2B, is about data visualization: creating graphics to display and communicate results. Homework 2A was about describing the middle and the variability of data using descriptive statistics.
BI311 students: What to turn in
BI311 students: For homework reports, report only your answers for the even-numbered questions; answers to the odd-numbered problems are provided to you. Read through the entire homework before starting to answer a question; all questions are intended to help you achieve the learning outcomes for the chapter. We recommend that you work through the odd-numbered problems on your own to check your work and as a guide for working the other problems.
Homework 2B expectations
Read through the entire homework before starting to answer a question. All of the coding required is covered in Chapter 4 of Mike’s Biostatistics Book. See also the relevant tutorials in R work.
How to work this homework
You may work together, but each of you must turn in your own report. Don’t “plagiarize” from each other. Do include in your report who you worked with.
What to turn in: A pdf file containing your answers to the even-numbered questions and relevant R code; by relevant we mean include your own code, not copies of code provided to you. For statistical results, report appropriate significant figures. Use of RMarkdown is recommended because it is a simple way to include the graphs you generate; however, copying and pasting into a Word document, then converting to pdf, is also acceptable.
Notes. By relevant we mean provide just the R code and results from R functions necessary to support your answers to the questions. For example, do not include
- the entire data set when head(dataset) will do
- screenshots of R output!! R output is text — copy/paste
- all statistical output from an R function.
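The notes above can be sketched in a few lines of R. This is only an illustration: the built-in ToothGrowth data set stands in for your own data, and the object names (first_rows, mean_len, median_by_supp) are made up for this example.

```r
# ToothGrowth ships with R; it stands in for your own data set here
first_rows <- head(ToothGrowth)   # first six rows only, not the whole data set
first_rows

# Report one summary value, rounded to three significant figures,
# rather than pasting the full output of summary()
mean_len <- signif(mean(ToothGrowth$len), 3)
mean_len

# Pull out only the group medians instead of every statistic a function returns
median_by_supp <- tapply(ToothGrowth$len, ToothGrowth$supp, median)
median_by_supp
```

Copy the printed text from the R console into your report; do not take a screenshot.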
See Part09: Making a report for an example homework file.
Submit your work to CANVAS. Obey proper file naming formats.
Resources for this homework
Mike’s Biostatistics Book: Chapter 4
Answers to odd numbered questions, Answers – Hwk2B.
Mike’s Workbook for Biostatistics: A quick look at R and R Commander, Part01 – Part10 and refer also to code presented in Homework 2 from this workbook.
Additional R commands and/or code are provided below.
Questions
- Make a boxplot of the median values of the basal five hour fasting plasma glucose-to-insulin ratio of four inbred strains of mice, in order DBA, BL6, FVB, and 129.
strains <- c("DBA","DBA","DBA","DBA","BL6","BL6","BL6","BL6","FVB","FVB","FVB","FVB","129","129","129","129")
x <- c(30.5,48.8, 37.4,56.6,
120.4,149,122.7,104.4,
182.8,186.7,110.1,102.8,
135.8,129.7,131.1,90.3)
- Why should you use box plots and not bar charts to display comparisons for a ratio scale variable between categories? Obtain a copy of the article by Streit and Gehlenborg 2014 (it’s free!). After reading, summarize the pros and cons of box plots compared with bar charts with error bars.
- Enter the following data into R. The data are sulfate levels in stream water, parts per million (data from M. Dohm).
d.month = c("Jan","Jan", "Jan","Mar","Mar","Mar")
sulfateppm =c(14.3, 14.8, 14.8,9.3,9.4,9.3)
try = data.frame(d.month,sulfateppm)
# Get the means
byWater = tapply(try$sulfateppm,list(Group=try$d.month),mean); byWater
Make a simple boxplot using the boxplot2 function in the gplots package.
(a) Change the range of values on the vertical axis to 0 to 20, ylim=c(0,20).
(b) Change the color of the bars from gray to blue.
(c) Add a label to the vertical axis, “Sulfates, ppm.”
(d) Add a red box around the graph.
- Make a scatterplot of Height (inches) of mothers by Height (inches) of fathers,
mom <- c(67, 66.5, 64, 58.5, 68, 66.5) #(data from GaltonFamilies in R package HistData)
and fathers,
dad <- c(78.5, 75.5, 75, 75, 74, 74) #(data from GaltonFamilies in R package HistData)
- Make a plot of Carbon dioxide (CO2) readings from Mauna Loa for the month of December for demi-decade 1960 – 2020
years <- c(1960, 1965, 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010, 2015, 2020) #obviously, do not calculate statistics on years; you can use them to make a plot
co2 <- c(316.19, 319.42, 325.13, 330.62, 338.29, 346.12, 354.41, 360.82, 396.83, 380.31, 389.99, 402.06, 414.26) #data from Dr. Pieter Tans, NOAA/GML (gml.noaa.gov/ccgg/trends/) and Dr. Ralph Keeling, Scripps Institution of Oceanography (scrippsco2.ucsd.edu/)
- Make a box plot of body mass of Rhinella marina (formerly Bufo marinus),
bufo <- c(71.3, 71.4, 74.1, 85.4, 85.4, 86.6, 97.4, 99.6, 107, 115.7, 135.7, 156.2)
- Make a histogram for each data set
(a) Body temperature readings, deg C, from IR device on two body regions (groups).
body.T <- c(33.2,34.3,34.8,36.1,35.9,35.1,34,34.2,33.6,35.4,35.3,35.4,33.6,33.7,33.8,32.9,34.8,33.7,34.8,33.7,33.1,36.2,34.3,36.3)
body.R <- c("Forehead","Forehead","Forehead","Throat","Throat","Throat","Forehead","Forehead","Forehead","Throat","Throat","Throat","Forehead","Forehead","Forehead","Throat","Throat","Throat","Forehead","Forehead","Forehead","Throat","Throat","Throat")
(b) Distance of darts thrown from bullseye (cm), by two individual dart throwers, aa & bb.
dart.D <- c(4.06,8.89,0.00,10.16,11.43,0.00,7.62,7.62,7.37,9.14, NA,10.67)
tossed.by <- c("aa","aa","aa","aa","aa","aa","bb","bb","bb","bb","bb","bb")
(c) Maximum length (cm) of six mollusk species.
shell.length <- c(14.1,17.2,17.6,8,6.83,6.75,6.3,7.7,7.6,6.1,7.2,4.6,17,13.6,13.5,18.5,15.3,19,6.4,7.5,7,7.3,9.1,9)
mollusk.group <- c("SeaStar","SeaStar","SeaStar","Snail","Snail","Snail","SandDollar","SandDollar","SandDollar","Conus","Conus","Conus","Starfish","Starfish","Starfish","Starfish","Starfish","Starfish","Seashell","Seashell","Seashell","Seashell","Seashell","Seashell")
- Make a box plot by groups for each of the datasets listed in question 7.
- Re-create the heat map for the Comet assay experiment described in . The data, with R code to load them, are
myData <- read.table(header=TRUE, sep=",", text="
Copper, Hazel, HazelCopper
0.02404672, 0.007185706, 0.02663191
0.06711479, 0.027020958, 0.03181153
0.12196060, 0.037725842, 0.03743693
0.13308991, 0.044762867, 0.03851548
0.13344032, 0.045809398, 0.18787608
0.17537831, 0.060942269, 0.19494708
")
(a) Make the heat map with the default settings, including dendrograms.
(b) Make another heat map, but do not make dendrograms.
(c) Compare graph a and graph b. Did the order change?
(d) Change the color palette and re-create graph b.
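Several questions above ask for groups in a particular order. By default R sorts character group labels alphabetically, so the plotted order may not match the order you want. The sketch below shows one common way to control it, using the levels argument of factor(). The data and names (dose, response, dose_ordered) are hypothetical and are not part of this homework.

```r
# Hypothetical data: group labels and values are made up for illustration
dose <- c("low", "low", "low", "mid", "mid", "mid", "high", "high", "high")
response <- c(2.1, 2.5, 1.9, 3.4, 3.8, 3.1, 5.2, 4.9, 5.6)

# By default, R sorts character groups alphabetically: high, low, mid
default_order <- levels(factor(dose))
default_order

# Set the display order explicitly with the levels argument
dose_ordered <- factor(dose, levels = c("low", "mid", "high"))

# boxplot() draws the groups in level order; it also returns the plot
# statistics invisibly, including the group names in plotting order
bp <- boxplot(response ~ dose_ordered,
              xlab = "Dose", ylab = "Response (arbitrary units)")
bp$names
```

The same ordered factor can be reused for other grouped graphs, so the order only has to be set once.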
R commands, copy, paste & modify to Script
To access the R help page for a command, add a question mark in front of the command. For example, ?plot brings up the help page for plot, either in your default browser or in the RStudio Help pane.
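Besides the question mark, a few other base R functions are handy for checking what options a plotting function accepts. The lines below are a small sketch; hist is used only as an example, and hist_args is a name made up for illustration.

```r
# Equivalent ways to open a help page (left as comments so this script
# runs without opening a help viewer):
#   ?plot
#   help("plot")
#   ??histogram    # search the help system when you do not know the name

# args() prints a function's arguments in the console
args(hist)

# List the named arguments of the default hist method
hist_args <- names(formals(getS3method("hist", "default")))
hist_args
```

Arguments shown this way, such as breaks or ylim for hist, are the ones you can modify after copying code into your script.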