I’ve long been interested in exactly how R works - not quite enough to learn all the internals, but I was surprised that I could not find a clear guide to exactly how vectorization works at the deepest level.
Let’s say we want to add two vectors which we’ve defined as x and y
x <- c(2, 4, 6)
y <- c(1, 3, 2)
One way to do this (the verbose, elementwise way) would be to add each pair of elements
c(x[1] + y[1], x[2] + y[2], x[3] + y[3])
## [1] 3 7 8
but if you are familiar with not repeating yourself, you might write this in a loop. Best practice involves pre-filling the result to the correct size
res <- c(NA, NA, NA)
for (i in 1:3) {
res[i] <- x[i] + y[i]
}
res
## [1] 3 7 8
There, we wrote a for loop.
Now, if you’ve ever seen R tutorials or discussions, you’ve probably had it drilled into you that
this is “bad” and should be replaced with something else. One of those options is an apply family function
plus <- function(i) {
x[i] + y[i]
}
sapply(1:3, plus)
## [1] 3 7 8
or something ‘tidier’
purrr::map_dbl(1:3, plus)
## [1] 3 7 8
(yes, yes, these functions are ‘bad’ because they don’t take x or y as arguments) but this still performs the operation elementwise. If you’ve seen even more R tutorials/discussions, you’ve probably seen that vectorization is very handy - the R function + knows what to do with objects that aren’t just a single value, and does what you might expect
x + y
## [1] 3 7 8
Now, if you’ve really read a lot about R, you’ll know that ‘under the hood’ a for loop is involved in every one of these, but it’s “lower down”, “at the C level”. Jenny Bryan makes the
point that “Of course someone has to write loops / It doesn’t have to be you” and for this reason,
vectorization in R is of great benefit.
So, there is a loop, but where exactly does that happen?
At some point, the computer needs to add the elements of x to the elements of y, and the simplest version of this happens one element at a time, in a loop. There’s a big sidetrack here about SIMD which I’ll try to avoid, but I will mention that the Microsoft fork of R (the artist formerly known as Revolution R) running on Intel chips can do SIMD via MKL.
So, let’s start at the operator.
`+`
## function (e1, e2) .Primitive("+")
Digging into primitives is a little tricky, but {pryr} can help
pryr::show_c_source(.Primitive("+"))
+ is implemented by do_arith with op = PLUSOP
We can browse a copy of the source for do_arith (in arithmetic.c) here, where we see some logic paths for scalars and vectors. Let’s assume we’re working with our example, which has length(x) == length(y) > 1. With two non-scalar arguments
if !IS_SCALAR and argc == length(arg) == 2
This leads us to call R_binary
Depending on the class of the arguments, we need to call different functions, but for the sake of our example let’s say we have non-integer real numbers, so we fork to real_binary. This takes a code argument for which type of operation we’re performing, and in our case it’s PLUSOP (noted above). There’s a case branch for this operation in which, provided the arguments are of the same length (n1 == n2), we call
R_ITERATE_CHECK(NINTERRUPT, n, i, da[i] = dx[i] + dy[i];);
That’s starting to look a lot like a loop - there’s an iterator i and we’re going to call another function. This jumps us over to a different file where we see that LOOP_WITH_INTERRUPT_CHECK definitely performs some sort of loop. This takes the body above and the argument R_ITERATE_CORE, which is finally the actual loop!
#define R_ITERATE_CORE(n, i, loop_body) do { \
for (; i < n; ++i) { loop_body } \
} while (0)
So, that’s where the actual loop in a vectorized R call happens! All of that sits behind the innocent-looking +.
That was thoroughly satisfying, but I did originally have in mind comparing R to another language - one where loops aren’t frowned upon because of performance, but rather encouraged… How do Julia loops differ?
Julia is not a vectorized language per se, but it has a neat ability to “vectorize” any operation, though in Julia this is called “broadcasting”.
Simple addition can combine scalar values
3+4
## 7
Julia actually has scalar values (in R, even a single value is just a vector of length 1) so a single value could be
typeof(3)
## Int64
whereas several values need to be an Array, even if it only has 1 dimension
Vector{Int64}([1, 2, 3])
## 3-element Array{Int64,1}:
## 1
## 2
## 3
Trying to add two Arrays does work
[1, 2, 3] + [4, 5, 6]
## 3-element Array{Int64,1}:
## 5
## 7
## 9
but only because a specific method has been written for this case, i.e.
methods(+, (Array, Array))
## # 1 method for generic function "+":
## [1] +(A::Array, Bs::Array...) in Base at arraymath.jl:43
One thing I particularly like is that we can see exactly which method was called using the @which macro
@which [1, 2, 3, 4] + [1, 2, 3, 4]
+(A::Array, Bs::Array...) in Base at arraymath.jl:43
something that I really wish was easier to do in R. The @edit macro even jumps us right into the actual code for this dispatched call.
This ‘add vectors’ problem can be solved through broadcasting, which performs an operation elementwise
[1, 2, 3] .+ [4, 5, 6]
## 3-element Array{Int64,1}:
## 5
## 7
## 9
A fun fact I recently learned is that broadcasting works on any operation, even the pipe itself
["a", "list", "of", "strings"] .|> [uppercase, reverse, titlecase, length]
## 4-element Array{Any,1}:
## "A"
## "tsil"
## "Of"
## 7
Back to our loops, the method for + on two Arrays points us to arraymath.jl (linked to the current relevant line), which contains
function +(A::Array, Bs::Array...)
for B in Bs
promote_shape(A, B) # check size compatibility
end
broadcast_preserving_zero_d(+, A, Bs...)
end
The last part of that is the meaningful part, and that leads to Broadcast.broadcast_preserving_zero_d. This starts to get out of my depth, but essentially
@inline function broadcast_preserving_zero_d(f, As...)
bc = broadcasted(f, As...)
r = materialize(bc)
return length(axes(bc)) == 0 ? fill!(similar(bc, typeof(r)), r) : r
end
@inline broadcast(f, t::NTuple{N,Any}, ts::Vararg{NTuple{N,Any}}) where {N} = map(f, t, ts...)
involves a map operation to achieve the broadcasting.
Perhaps that’s a problem to tackle when I’m better at digging through Julia.
As always, comments, suggestions, and anything else welcome!
devtools::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
## setting value
## version R version 4.1.2 (2021-11-01)
## os Pop!_OS 21.04
## system x86_64, linux-gnu
## ui X11
## language en_AU:en
## collate en_AU.UTF-8
## ctype en_AU.UTF-8
## tz Australia/Adelaide
## date 2022-04-22
##
## ─ Packages ───────────────────────────────────────────────────────────────────
## package * version date lib source
## blogdown 1.8 2022-02-16 [1] CRAN (R 4.1.2)
## bookdown 0.24 2021-09-02 [1] CRAN (R 4.1.2)
## brio 1.1.1 2021-01-20 [3] CRAN (R 4.0.3)
## bslib 0.3.1 2021-10-06 [1] CRAN (R 4.1.2)
## cachem 1.0.3 2021-02-04 [3] CRAN (R 4.0.3)
## callr 3.7.0 2021-04-20 [1] CRAN (R 4.1.2)
## cli 3.2.0 2022-02-14 [1] CRAN (R 4.1.2)
## crayon 1.5.0 2022-02-14 [1] CRAN (R 4.1.2)
## desc 1.4.1 2022-03-06 [1] CRAN (R 4.1.2)
## devtools 2.4.3 2021-11-30 [1] CRAN (R 4.1.2)
## digest 0.6.27 2020-10-24 [3] CRAN (R 4.0.3)
## ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.1.2)
## evaluate 0.14 2019-05-28 [3] CRAN (R 4.0.1)
## fastmap 1.1.0 2021-01-25 [3] CRAN (R 4.0.3)
## fs 1.5.0 2020-07-31 [3] CRAN (R 4.0.2)
## glue 1.6.1 2022-01-22 [1] CRAN (R 4.1.2)
## htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.1.2)
## jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.1.2)
## jsonlite 1.7.2 2020-12-09 [3] CRAN (R 4.0.3)
## JuliaCall * 0.17.4 2021-05-16 [1] CRAN (R 4.1.2)
## knitr 1.37 2021-12-16 [1] CRAN (R 4.1.2)
## lifecycle 1.0.1 2021-09-24 [1] CRAN (R 4.1.2)
## magrittr 2.0.1 2020-11-17 [3] CRAN (R 4.0.3)
## memoise 2.0.0 2021-01-26 [3] CRAN (R 4.0.3)
## pkgbuild 1.2.0 2020-12-15 [3] CRAN (R 4.0.3)
## pkgload 1.2.4 2021-11-30 [1] CRAN (R 4.1.2)
## prettyunits 1.1.1 2020-01-24 [3] CRAN (R 4.0.1)
## processx 3.5.2 2021-04-30 [1] CRAN (R 4.1.2)
## ps 1.5.0 2020-12-05 [3] CRAN (R 4.0.3)
## purrr 0.3.4 2020-04-17 [3] CRAN (R 4.0.1)
## R6 2.5.0 2020-10-28 [3] CRAN (R 4.0.2)
## Rcpp 1.0.6 2021-01-15 [3] CRAN (R 4.0.3)
## remotes 2.4.2 2021-11-30 [1] CRAN (R 4.1.2)
## rlang 1.0.1 2022-02-03 [1] CRAN (R 4.1.2)
## rmarkdown 2.13 2022-03-10 [1] CRAN (R 4.1.2)
## rprojroot 2.0.2 2020-11-15 [3] CRAN (R 4.0.3)
## rstudioapi 0.13 2020-11-12 [3] CRAN (R 4.0.3)
## sass 0.4.0 2021-05-12 [1] CRAN (R 4.1.2)
## sessioninfo 1.1.1 2018-11-05 [3] CRAN (R 4.0.1)
## stringi 1.5.3 2020-09-09 [3] CRAN (R 4.0.2)
## stringr 1.4.0 2019-02-10 [3] CRAN (R 4.0.1)
## testthat 3.1.2 2022-01-20 [1] CRAN (R 4.1.2)
## usethis 2.1.5 2021-12-09 [1] CRAN (R 4.1.2)
## withr 2.5.0 2022-03-03 [1] CRAN (R 4.1.2)
## xfun 0.30 2022-03-02 [1] CRAN (R 4.1.2)
## yaml 2.2.1 2020-02-01 [3] CRAN (R 4.0.1)
##
## [1] /home/jono/R/x86_64-pc-linux-gnu-library/4.1
## [2] /usr/local/lib/R/site-library
## [3] /usr/lib/R/site-library
## [4] /usr/lib/R/library