Help with R package digitize

Background

From time to time we need to extract data from images. Today, we mostly work from digital images, but the principles date back to photography. For example, you may need to measure the length of a shell. In general, if a known unit of measure is included in the image, then a calibration can be obtained between number of pixels and units of measurement (mm, cm, etc.). Lengths on uncalibrated images are recorded as number of pixels.
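As a minimal sketch of that calibration, assume a scale bar of known length is visible in the image. The pixel coordinates and the 10 mm bar length below are made-up values for illustration, and `pixel_dist` is a hypothetical helper, not part of any package.

```r
# Euclidean distance between two points given as c(x, y) in pixels
pixel_dist <- function(p1, p2) sqrt(sum((p2 - p1)^2))

# Hypothetical pixel coordinates of the two ends of a 10 mm scale bar
bar_start <- c(100, 400)
bar_end   <- c(400, 400)
bar_mm    <- 10

# Calibration factor: millimeters per pixel
mm_per_px <- bar_mm / pixel_dist(bar_start, bar_end)

# Hypothetical pixel coordinates of the two ends of the shell
shell_start <- c(150, 120)
shell_end   <- c(450, 520)

shell_px <- pixel_dist(shell_start, shell_end)  # length in pixels
shell_mm <- shell_px * mm_per_px                # length in mm
```

Without the scale bar, only `shell_px` could be reported.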

This post is not about how to acquire good-quality images suitable for “science.” Instead, it briefly introduces the R package digitize, free software that helps extract data from plot images.

While I prefer ImageJ, the R package digitize is useful for 2D images.

Wait! Is this legal??

The task described here is broadly termed scraping, and the typical rules apply. Images published in the vast majority of scientific journals come with copyright protection, so if the figure is copyright-protected and the data are used without permission, then nope, probably not legal. Caveat rade. Within reason, you are welcome to scrape Mike’s Workbook for Biostatistics or Mike’s Biostatistics Book.

SOP R package digitize

Modified from https://lukemiller.org/index.php/2011/06/digitizing-data-from-old-plots-using-digitize/

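Step 1. Load the image with ReadAndCal(), then click four calibration points on the axes: two known values on the x-axis (x1, x2), then two known values on the y-axis (y1, y2).

Step 2. Create an object in R to gather points with DigitData(), then return to the figure. Click on each data point. Right-click to stop.

Step 3. Convert the raw x,y click coordinates into the scale of the original plot with Calibrate(), supplying the axis values that correspond to x1, x2, y1, and y2.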
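The conversion in Step 3 amounts to a straight-line mapping from click coordinates to axis units, fit through the two calibration points on each axis. The sketch below illustrates the idea with a hypothetical helper, `rescale_axis`, and made-up pixel values. It is not the package's own code, and in practice Calibrate() does this for you.

```r
# Map raw coordinates to axis units using two calibration points.
# px1, px2: raw positions of the two calibration clicks on one axis
# v1, v2:   the axis values those clicks correspond to
rescale_axis <- function(px, px1, px2, v1, v2) {
  slope <- (v2 - v1) / (px2 - px1)
  v1 + slope * (px - px1)
}

# Hypothetical clicks: x = 1.6 at raw position 80, x = 2.2 at raw position 680
x_raw   <- c(80, 380, 680)
x_units <- rescale_axis(x_raw, px1 = 80, px2 = 680, v1 = 1.6, v2 = 2.2)

# Hypothetical clicks: y = 0.4 at raw position 500, y = 0.9 at raw position 100.
# A negative slope (value rises as raw position falls) works the same way.
y_raw   <- c(500, 300, 100)
y_units <- rescale_axis(y_raw, px1 = 500, px2 = 100, v1 = 0.4, v2 = 0.9)
```

Because the mapping is linear, it only suits plots with linear axes. A log-scaled axis would need the axis values transformed first.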

Example image: Drosophila fly wings.

Figure 1. Scatterplot Drosophila fly wings, Chapter 16.1 Mike’s Biostatistics Book.

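# sample script
library(digitize)
# Open the image, then click the four calibration points (x1, x2, y1, y2)
cal <- ReadAndCal("plot_FlyWings.png")
# Click each data point on the figure; right-click to stop
data.points <- DigitData(col = "red")
# Convert clicks to plot units using the axis values at the calibration points
df <- Calibrate(data.points, cal, 1.6, 2.2, 0.4, 0.9)
# Check the first 6 rows
head(df)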

R output

            x            y
1 -0.06987472 -0.002066145
2  0.18542807  0.017338403
3  0.52846999  0.040014348
4  0.82221519  0.069146998
5  0.95902492  0.113827391
6  1.30817240  0.258078772