Another day, another interesting challenge.
I follow Bob Rudis’ (a.k.a. hrbrmstr’s) blog (rud.is), typically via R-bloggers (www.r-bloggers.com), and this post (rud.is) caught my eye, partly because I thought I knew of an existing way to do this. As usual, actually getting that to work took a little longer than I had hoped, but I think the end result is pretty neat.
His post describes the process of writing an R function to take an image file, for example this one

and producing histograms along its sides that count the pixels in each row and column. This is what he created (using a different image from the example, I believe)

Something funny is going on with the right-hand histogram; it doesn’t line up with the image.
Here’s my approach.

    setwd("WORKINGDIRECTORY")
    library(png)     ## split .png into component matrices
    library(RSAGA)   ## grid.to.xyz, load before dplyr
    library(dplyr)   ## group_by(), summarise(), %>%
    library(ggplot2) ## plotting
    library(ggExtra) ## ggMarginal

    ## load the .png file
    img <- png::readPNG("file10a566a2b4dc3.png")

    ## convert each channel into a data.frame
    imgdf1 <- RSAGA::grid.to.xyz(img[,,1]) %>% mutate(col="red")   # red channel
    imgdf2 <- RSAGA::grid.to.xyz(img[,,2]) %>% mutate(col="green") # green channel
    imgdf3 <- RSAGA::grid.to.xyz(img[,,3]) %>% mutate(col="blue")  # blue channel

    ## collect these together to average
    imgdf <- bind_rows(imgdf1, imgdf2, imgdf3)

    ## shift the coordinates to the correct values, and subtract the pixel values from 1 (white=0)
    imgdf$x <- imgdf$x + 1
    imgdf$y <- imgdf$y + 1
    imgdf$z <- 1 - imgdf$z

    ## average the three channels (median) at each pixel
    ## (plain assignment, since %<>% comes from magrittr and is not exported by dplyr)
    imgdf <- imgdf %>% group_by(x, y) %>% summarise(z=median(z))

    ## plot the pixels, removing white pixels (z==0)
    p <- ggplot(imgdf %>% filter(z!=0), aes(x=x, y=y)) + geom_point(size=0.1, pch=19) +
      theme(panel.background=element_blank(),
            line=element_blank(),
            axis.text=element_blank(),
            axis.title=element_blank(),
            plot.margin=unit(c(0,0,0,0), "lines"),
            complete=TRUE)

    ## add marginal histograms of the pixel counts along each axis
    ggExtra::ggMarginal(p, type="histogram", binwidth=1, col="darkturquoise")

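For anyone without RSAGA installed, the reshaping and averaging steps can be roughly sketched in base R. This is only an illustration under stated assumptions: `matrix_to_xyz()` is a hypothetical helper name, the toy array stands in for the output of `png::readPNG()`, and the coordinate convention (x = column, y = row counted from the bottom) is an assumption that may not match what `RSAGA::grid.to.xyz` returns.

```r
## Hypothetical helper (not from RSAGA or the original post): reshape a
## pixel matrix into a long data.frame with one row per pixel.
matrix_to_xyz <- function(m) {
  data.frame(
    x = as.vector(col(m)),                 # column index, left to right
    y = nrow(m) + 1L - as.vector(row(m)),  # row index, counted from the bottom
    z = as.vector(m)                       # pixel value
  )
}

## Toy 2 x 3 x 3 array standing in for the output of png::readPNG()
set.seed(1)
img <- array(runif(18), dim = c(2, 3, 3))

## Per-pixel median across the three channels, inverted so that white = 0
darkness <- 1 - apply(img[, , 1:3], c(1, 2), median)

## Long format: one row per pixel, ready for plotting
imgdf <- matrix_to_xyz(darkness)
str(imgdf)
```

Taking the median on the array with `apply()` avoids the bind/group/summarise round trip, at the cost of departing from the dplyr pipeline used above.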
It uses the png package to read the image into an array of channels, converts each channel to an x,y,z data.frame, takes the per-pixel median across the three channels, plots the non-white pixels with ggplot2, then uses ggExtra::ggMarginal to add the marginal histograms. Note that the ggExtra package has some bugs (it hasn’t been maintained in a while) when used with more recent versions (possibly only the dev branch) of ggplot2. I got it working on at least one of my machines. This is my result

I’ve had several uses for these types of marginal plots lately, so hopefully I can sort out the issues I’ve been getting in combination with ggplot2.
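As a cross-check on what the marginal histograms should show, the per-row and per-column counts can be computed directly from the pixel matrix, and a rough version of the figure can be drawn with base graphics alone. This is a sketch, not a replacement for ggMarginal: the toy matrix `z` and the helper `plot_with_margins()` are hypothetical, and the panels line up only because each marginal panel shares its margins with the main panel on the relevant sides.

```r
## Toy "image": 0 = white, anything else = ink (stands in for the 1 - z values)
z <- matrix(0, nrow = 4, ncol = 5)
z[2, 2:4] <- 1
z[3, 3]   <- 0.5

ink <- z != 0
col_counts <- colSums(ink)  # one count per column (top histogram)
row_counts <- rowSums(ink)  # one count per row (right-hand histogram)

## Hypothetical helper: image plus marginal bar charts, base graphics only
plot_with_margins <- function(z) {
  ink <- z != 0
  old <- par(c("mar", "xaxs", "yaxs"))
  on.exit({ par(old); layout(1) })
  layout(matrix(c(2, 0,
                  1, 3), nrow = 2, byrow = TRUE),
         widths = c(4, 1), heights = c(1, 4))
  ## main panel: flip rows so that row 1 of the matrix is drawn at the top
  par(mar = c(1, 1, 0, 0), xaxs = "i", yaxs = "i")
  image(seq_len(ncol(z)), seq_len(nrow(z)), t(z[nrow(z):1, , drop = FALSE]),
        col = gray(seq(1, 0, length.out = 256)),
        axes = FALSE, xlab = "", ylab = "")
  ## top panel: same left/right margins as the main panel
  par(mar = c(0, 1, 1, 0))
  barplot(colSums(ink), space = 0, axes = FALSE, col = "darkturquoise")
  ## right panel: same bottom/top margins; reversed to match the flipped rows
  par(mar = c(1, 0, 0, 1))
  barplot(rev(rowSums(ink)), space = 0, horiz = TRUE, axes = FALSE,
          col = "darkturquoise")
  invisible(list(cols = colSums(ink), rows = rowSums(ink)))
}

plot_with_margins(z)
```

With `space = 0` and `xaxs = "i"`/`yaxs = "i"`, each bar spans exactly one pixel column or row, so a bar that drifts away from its pixels points to a margin mismatch rather than a counting problem.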