I owe a lot to Jeff Leek and Roger Peng for their great Coursera courses, in which I learned to program in R.
They (along with Rafa Irizarry) run the Simply Statistics (simplystatistics.org) blog, which I highly recommend. They posted a Thanksgiving puzzle in which a data.frame needs to be converted from one form to another, spelling out ‘thanksgiving’.
http://simplystatistics.org/2015/11/25/a-thanksgiving-dplyr-rubiks-cube-puzzle-for-you/
The puzzle: convert this
col3 col2 col4 col1
h    a    t    t
i    v    i    g
k    s    g    n
n    g    n    i
into this
col1 col3 col2
t    h    a
n    k    s
g    i    v
i    n    g
My solution, which uses Rubik’s Cube rotations of rows and columns (and dplyr of course):
## transform the input to the output using rubiks-cube transformations and dplyr
library(dplyr)
## provided input
input <- data.frame(matrix(c("h","a","t","t","i","v","i","g","k","s","g","n","n","g","n","i"),
                           4,
                           4,
                           byrow=TRUE,
                           dimnames=list(NULL,paste0("col",1:4))),
                    stringsAsFactors=FALSE)
## requested output
output <- data.frame(matrix(c("t","h","a","n","k","s","g","i","v","i","n","g"),
                            4,
                            3,
                            byrow=TRUE,
                            dimnames=list(NULL,paste0("col",1:3))),
                     stringsAsFactors=FALSE)
## construct the column rotating function using dplyr commands
## taking care of wrap-around
## allows forwards or backwards rotation through + or - operator
rotate_col <- function(df, colnum) {
  if (colnum > 0) {
    ## positive colnum: shift the column up by 1 step (row2->row1, row3->row2, row4->row3)
    ## with wrap-around row1->row4
    rotdfcol <- df %>% slice(c(2,3,4,1)) %>% select(all_of(colnum))
  } else {
    ## negative colnum: shift the column down by 1 step (row1->row2, row2->row3, row3->row4)
    ## with wrap-around row4->row1
    colnum <- -colnum
    rotdfcol <- df %>% slice(c(4,1,2,3)) %>% select(all_of(colnum))
  }
  ## write the rotated column back into a copy of the input
  newdf <- df
  newdf[, colnum] <- rotdfcol
  return(newdf)
}
## construct the row rotating function using dplyr commands
## using the same conventions as rotate_col
rotate_row <- function(df, rownum) {
  if (rownum > 0) {
    rotdfrow <- df %>% select(c(4,1,2,3)) %>% slice(rownum)
  } else {
    rownum <- -rownum
    rotdfrow <- df %>% select(c(2,3,4,1)) %>% slice(rownum)
  }
  newdf <- df
  newdf[rownum, ] <- rotdfrow
  return(newdf)
}
## perform rubiks-cube operations to achieve the desired result, utilising row 4 as free spaces
## in no way guaranteed to be the fastest method, just the first one that worked
input %>%
  rotate_row(+1) %>%
  rotate_col(+4) %>% rotate_col(+4) %>%
  rotate_col(-1) %>% rotate_row(+4) %>% rotate_col(-1) %>% rotate_row(-4) %>% rotate_col(+1) %>% rotate_col(+1) %>%
  rotate_col(-2) %>% rotate_row(+4) %>% rotate_col(-2) %>% rotate_row(-4) %>% rotate_col(+2) %>% rotate_col(+2) %>%
  rotate_col(-3) %>% rotate_row(-4) %>% rotate_col(-3) %>% rotate_row(+4) %>% rotate_col(+3) %>% rotate_col(+3) %>%
  rotate_row(+4) %>% rotate_col(-1) %>% rotate_row(-4) %>% rotate_col(+1) %>%
  rotate_col(-2) %>% rotate_row(+4) %>% rotate_col(+2) %>%
  rotate_row(+4) %>% rotate_col(-3) %>% rotate_row(-4) %>% rotate_col(+3) %>%
  rotate_col(-4) %>% rotate_row(+4) %>% rotate_col(+4) %>%
  slice(c(1,2,3)) -> solved ## still a 3x4 matrix
## re-shape the solution into a 4x3 matrix
final <- solved %>%
  t %>%
  c %>%
  matrix(4,
         3,
         byrow=TRUE,
         dimnames=list(NULL,paste0("col",1:3))) %>%
  data.frame(stringsAsFactors=FALSE)
## check the solution is correct
identical(final, output) # TRUE
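The index vectors c(2,3,4,1) and c(4,1,2,3) used in rotate_col and rotate_row are hard-coded for a 4x4 frame. As a rough sketch only, the same wrap-around could be generated with modular arithmetic so that it works for any size and any number of steps. The names rot_index and rotate_col_n are hypothetical helpers, not part of the solution above, and this version uses base R indexing rather than dplyr.

```r
## Hypothetical helper: cyclic index vector of length n, shifted by k steps.
## k = +1 reproduces c(2,3,4,1) for n = 4; k = -1 reproduces c(4,1,2,3).
rot_index <- function(n, k = 1) {
  ((seq_len(n) - 1 + k) %% n) + 1
}

## Hypothetical generalisation of rotate_col: rotate one column by k steps,
## leaving every other column untouched.
rotate_col_n <- function(df, colnum, k = 1) {
  df[[colnum]] <- df[[colnum]][rot_index(nrow(df), k)]
  df
}

rot_index(4, 1)   # 2 3 4 1
rot_index(4, -1)  # 4 1 2 3
```

Under this sketch, rotate_col_n(df, 4, 2) would stand in for two consecutive rotate_col(+4) calls.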
Suggestions on how I could have done this differently (or automated solutions) most welcome!
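One possible shortcut, offered as a sketch rather than as the intended answer: if the input carries the column names shown in the puzzle table (col3, col2, col4, col1) instead of the positional col1 to col4 names used in my listing, then reordering three columns and swapping the two middle rows appears to be enough. The names puzzle and direct are made up for this example.

```r
library(dplyr)

## Input rebuilt with the column names as displayed in the puzzle table
## (an assumption; the listing above names the columns col1..col4 by position).
puzzle <- data.frame(
  col3 = c("h", "i", "k", "n"),
  col2 = c("a", "v", "s", "g"),
  col4 = c("t", "i", "g", "n"),
  col1 = c("t", "g", "n", "i"),
  stringsAsFactors = FALSE
)

## Keep col1, col3, col2 in that order, then swap rows 2 and 3.
direct <- puzzle %>%
  select(col1, col3, col2) %>%
  slice(c(1, 3, 2, 4))

## Read the result row by row to check the spelling.
paste(t(as.matrix(direct)), collapse = "")
```

This skips the Rubik's Cube constraint entirely, so it answers a slightly different question from the rotation-only solution.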