I’ve been working my way through Exercism exercises in a variety of languages because I strongly believe every language you learn something about teaches you about all the others you know, and makes for useful comparisons between the features they offer. I was[1] Learning Me a Haskell for Great Good (a play on the guide/book Learn You a Haskell for Great Good!) and something about Pattern Matching just seemed extremely familiar.

Pattern Matching is sort of like a case statement, but rather than just comparing literal values against some enum, it takes into consideration how the input “looks”. A simple example is to match against either an empty list [] (just that; an empty list) or a non-empty list denoted (x:xs). In Haskell, : is the list constructor (cons in Lisp), which prepends an element to a list, so this pattern is the first element x followed by the rest of the list, xs. The wildcard pattern _ matches “whatever”.

A map function definition (from here) is then

map _ []     = []
map f (x:xs) = f x : map f xs

This is two equations defining map, chosen by which pattern the two arguments match. The first takes “whatever” (doesn’t matter, is ignored) and an empty list and just returns an empty list. The second takes some function f and a non-empty list, and prepends (:) f x (the function f applied to the first element x) to map f xs (the result of providing f and the rest of the list xs to map, recursively).

Since Haskell is strongly typed, I don’t think pattern matching alone can be used to define the same named function for different types, but it can certainly do something different depending on the pattern of the data. In this example, if the argument is an empty list, return 0; if the argument is a length-1 list (arg1 prepended to an empty list), return arg1 * 100; and if the argument is a longer list, return the product of the first element and the second. The program then prints the results of calling fun [5.0] and fun [5.0, 5.0].

fun :: [Float] -> Float
fun [] = 0.0
fun (arg1:[]) = arg1 * 100.0
-- arg2 is the rest of the list, which is non-empty by the time this case is reached
fun (arg1:arg2) = arg1 * (head arg2)

main :: IO ()
main = do
  print (fun [5.0])
  print (fun [5.0, 5.0])
500.0
25.0

Woo! A different calculation depending on the shape of the input. I believe it might be possible to actually have optional arguments via the Maybe type (from the Data.Maybe module) but I couldn’t get an example to compile the way I wanted[2].

Rust has something similar but more specific to a case statement; a match expression can take patterns as options and return whichever matches (example from here)

fn main() {
    let input = 's';

    match input {
        'q'                   => println!("Quitting"),
        'a' | 's' | 'w' | 'd' => println!("Moving around"),
        '0'..='9'             => println!("Number input"),
        _                     => println!("Something else"),
    }
}
Moving around
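Rust's match can also look at the "shape" of a list, much like the Haskell version above. The following is only a rough sketch of mine (not from the linked example), assuming the input is passed as a slice of f32; the name fun just mirrors the Haskell function.

```rust
// Hypothetical Rust counterpart of the Haskell `fun`: the match arms look at
// the shape of the slice rather than at literal values.
fn fun(xs: &[f32]) -> f32 {
    match xs {
        // empty slice
        [] => 0.0,
        // exactly one element
        [arg1] => arg1 * 100.0,
        // two or more elements; `..` ignores anything after the second
        [arg1, arg2, ..] => arg1 * arg2,
    }
}

fn main() {
    println!("{:?}", fun(&[5.0]));
    println!("{:?}", fun(&[5.0, 5.0]));
}
```

This should give the same two values as the Haskell program, 500 and 25. The compiler also checks that the three arms cover every possible slice length.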

Another common use of match is to switch between the variants of an enum, such as Some and None (of Option) or Ok and Err (of Result) (see here).
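A minimal sketch of that use, with two helper functions I've made up for illustration (first_times_100 and parse_number are not from any library): each match has one arm per variant, and the value inside the variant is bound to a name by the pattern.

```rust
// Hypothetical helper: the first element of a slice times 100, if there is one.
fn first_times_100(xs: &[f32]) -> Option<f32> {
    match xs.first() {
        Some(x) => Some(x * 100.0),
        None => None,
    }
}

// Hypothetical helper: parse text as a number, reporting failure as a String.
fn parse_number(text: &str) -> Result<f32, String> {
    match text.trim().parse::<f32>() {
        Ok(value) => Ok(value),
        Err(err) => Err(format!("could not parse {text:?}: {err}")),
    }
}

fn main() {
    match first_times_100(&[5.0]) {
        Some(value) => println!("Got {value}"),
        None => println!("Empty input"),
    }
    match parse_number("abc") {
        Ok(value) => println!("Parsed {value}"),
        Err(message) => println!("Error: {message}"),
    }
}
```

Both matches are spelled out longhand here to show the patterns; idiomatic Rust would usually reach for combinators like map and map_err instead.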

The familiarity of the Haskell pattern matching / function definition took me back to one of the very first programming ‘tricks’ I learned way back in the late 2000s while working on my PhD, using Fortran: “function overloading”. I wasn’t formally taught programming at all (an oversight, given how important it became to doing my research), so I just had to pick up bits and pieces from people who knew more.

I had a bunch of integration routines[3] which were slightly different depending on whether or not the limits were finite[4], so I had to call the right one with various if statements. The ‘trick’ I was taught was to use INTERFACE / MODULE PROCEDURE blocks to “dispatch” depending on the function signature, or at least the number of arguments. This meant that I could just call integrate regardless of whether it was a signature with 4 arguments, or a signature with an additional argument if a bound was Infty.

A “small” (Fortran is hardly economical with page real-estate) example of this, following the Haskell example, defines two functions Fun1arg and Fun2arg which can be consolidated into fun with the INTERFACE block. Calling fun(x) or fun(x, y) is routed to the function with the relevant signature.

MODULE exampleDispatch
  IMPLICIT NONE

  INTERFACE fun
    MODULE PROCEDURE Fun1arg, Fun2arg
  END INTERFACE fun

CONTAINS

  ! A function that takes one argument
  ! and multiplies it by 100
  REAL FUNCTION Fun1arg(arg1)
    IMPLICIT NONE
    REAL, INTENT( IN ) :: arg1
    Fun1arg = arg1 * 100.0
  END FUNCTION Fun1arg

  ! A function that takes two arguments
  ! and multiplies them
  REAL FUNCTION Fun2arg(arg1, arg2)
    IMPLICIT NONE
    REAL, INTENT( IN ) :: arg1, arg2
    Fun2arg = arg1 * arg2
  END FUNCTION Fun2arg

END MODULE exampleDispatch

PROGRAM dispatch

  USE exampleDispatch

  IMPLICIT NONE
  REAL :: a = 5.0
  REAL :: fun

  PRINT *, fun(a)
  PRINT *, fun(a, a)

END PROGRAM dispatch
   500.000000
   25.0000000

That takes me back! I’m going to dig out my old research code and get it into GitHub for posterity. I’m also going to do the Fortran exercises in Exercism to reminisce some more.

So, not quite the same as the Haskell version, but it got me thinking about dispatch. R has several approaches. The most common is S3 in which dispatch occurs based on the class of the first argument to a function, so you can have something different happen to a data.frame argument and a tibble argument, but in both cases the signature has the same “shape” - only the types vary.

Wiring that up to work differently with a list and any other value (the default case, which would break for anything that doesn’t vectorize, but it’s a toy example) looks like

fun <- function(x) {
  UseMethod("fun")
}

fun.default <- function(x) {
  x * 100
}

fun.list <- function(x) {
  x[[1]] * x[[2]]
}

fun(5)
fun(list(5, 5))
[1] 500
[1] 25

Another option is to use S4 which is more complicated but more powerful. Here, dispatch can occur based on the entire signature, though (and I may be wrong) I believe that, too, still needs to have a consistent “shape”. A fantastic guide to S4 is Stuart Lee’s post here.

An S4 version of my example could have two options for the signature: one where both x and y are "numeric", and another where y is "missing". "ANY" would also work and encompass a wider scope.

setGeneric("fun", function(x, y, ...) standardGeneric("fun"))

setMethod("fun", c("numeric", "missing"), function(x, y) {
  x * 100
})

setMethod("fun", c("numeric", "numeric"), function(x, y) {
  x * y
})

fun(5)
fun(5, 5)
[1] 500
[1] 25

So, can we ever do what I was originally inspired to do - write a simple definition of a function that calculates differently depending on the number of arguments? Aha - Julia to the rescue!! Julia has a beautifully simple syntax for defining methods on signatures: just write it out!

fun(x) = x * 100
fun(x, y) = x * y

println(fun(5))
println(fun(5, 5))
500
25

That’s two different signatures for fun computing different things, and a lot less boilerplate compared to the other languages, especially Fortran. What’s written above is the entire script. You can even go further and be specific about the types, say, mixing Int and Float64 definitions

fun(x::Int) = x * 100
fun(x::Float64) = x * 200

fun(x::Int, y::Int) = x * y
fun(x::Int, y::Float64) = x * y * 2

println(fun(5))
println(fun(5.))
println(fun(5, 5))
println(fun(5, 5.))
500
1000.0
25
50.0

It doesn’t get simpler or more powerful than that!!
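For contrast, Rust (the language of the match example above) has no overloading by number of arguments. The closest I can sketch is a trait implemented once per argument "shape", with multiple arguments packed into a tuple. Everything here (the trait Fun, the wrapper fun, the choice of i64 and f64) is my own assumption for illustration, not an established idiom from the examples above.

```rust
// Hypothetical trait: each implementation plays the role of one Julia method.
trait Fun {
    type Output;
    fn fun(self) -> Self::Output;
}

// One integer argument, like fun(x::Int)
impl Fun for i64 {
    type Output = i64;
    fn fun(self) -> i64 { self * 100 }
}

// One floating-point argument, like fun(x::Float64)
impl Fun for f64 {
    type Output = f64;
    fn fun(self) -> f64 { self * 200.0 }
}

// Two integers packed into a tuple, like fun(x::Int, y::Int)
impl Fun for (i64, i64) {
    type Output = i64;
    fn fun(self) -> i64 { self.0 * self.1 }
}

// An integer and a float, like fun(x::Int, y::Float64)
impl Fun for (i64, f64) {
    type Output = f64;
    fn fun(self) -> f64 { self.0 as f64 * self.1 * 2.0 }
}

// Free function so call sites read fun(...) rather than (...).fun()
fn fun<A: Fun>(args: A) -> A::Output {
    args.fun()
}

fn main() {
    println!("{}", fun(5_i64));
    println!("{}", fun(5.0_f64));
    println!("{}", fun((5_i64, 5_i64)));
    println!("{}", fun((5_i64, 5.0_f64)));
}
```

The implementation is picked at compile time from the argument type, so the two-argument calls need the extra tuple parentheses, and there is considerably more boilerplate than in the four Julia one-liners.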

I’ve added all these examples to a repo split out by language, and some instructions for running them (assuming you have the language tooling already set up).

Do you have another example from a language that does this (well? poorly?) or something similar? Leave a comment if you have one, or find me on Mastodon.

devtools::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.1.2 (2021-11-01)
##  os       Pop!_OS 22.04 LTS
##  system   x86_64, linux-gnu
##  ui       X11
##  language (EN)
##  collate  en_AU.UTF-8
##  ctype    en_AU.UTF-8
##  date     2023-06-17
##  pandoc   3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
##
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package     * version date (UTC) lib source
##  blogdown      1.17    2023-05-16 [1] CRAN (R 4.1.2)
##  bookdown      0.29    2022-09-12 [1] CRAN (R 4.1.2)
##  bslib         0.4.1   2022-11-02 [3] CRAN (R 4.2.2)
##  cachem        1.0.6   2021-08-19 [3] CRAN (R 4.2.0)
##  callr         3.7.3   2022-11-02 [3] CRAN (R 4.2.2)
##  cli           3.4.1   2022-09-23 [3] CRAN (R 4.2.1)
##  crayon        1.5.2   2022-09-29 [3] CRAN (R 4.2.1)
##  devtools      2.4.5   2022-10-11 [1] CRAN (R 4.1.2)
##  digest        0.6.30  2022-10-18 [3] CRAN (R 4.2.1)
##  ellipsis      0.3.2   2021-04-29 [3] CRAN (R 4.1.1)
##  evaluate      0.18    2022-11-07 [3] CRAN (R 4.2.2)
##  fastmap       1.1.0   2021-01-25 [3] CRAN (R 4.2.0)
##  fs            1.5.2   2021-12-08 [3] CRAN (R 4.1.2)
##  glue          1.6.2   2022-02-24 [3] CRAN (R 4.2.0)
##  htmltools     0.5.3   2022-07-18 [3] CRAN (R 4.2.1)
##  htmlwidgets   1.5.4   2021-09-08 [1] CRAN (R 4.1.2)
##  httpuv        1.6.6   2022-09-08 [1] CRAN (R 4.1.2)
##  jquerylib     0.1.4   2021-04-26 [3] CRAN (R 4.1.2)
##  jsonlite      1.8.3   2022-10-21 [3] CRAN (R 4.2.1)
##  knitr         1.40    2022-08-24 [3] CRAN (R 4.2.1)
##  later         1.3.0   2021-08-18 [1] CRAN (R 4.1.2)
##  lifecycle     1.0.3   2022-10-07 [3] CRAN (R 4.2.1)
##  magrittr      2.0.3   2022-03-30 [3] CRAN (R 4.2.0)
##  memoise       2.0.1   2021-11-26 [3] CRAN (R 4.2.0)
##  mime          0.12    2021-09-28 [3] CRAN (R 4.2.0)
##  miniUI        0.1.1.1 2018-05-18 [1] CRAN (R 4.1.2)
##  pkgbuild      1.4.0   2022-11-27 [1] CRAN (R 4.1.2)
##  pkgload       1.3.0   2022-06-27 [1] CRAN (R 4.1.2)
##  prettyunits   1.1.1   2020-01-24 [3] CRAN (R 4.0.1)
##  processx      3.8.0   2022-10-26 [3] CRAN (R 4.2.1)
##  profvis       0.3.7   2020-11-02 [1] CRAN (R 4.1.2)
##  promises      1.2.0.1 2021-02-11 [1] CRAN (R 4.1.2)
##  ps            1.7.2   2022-10-26 [3] CRAN (R 4.2.2)
##  purrr         1.0.1   2023-01-10 [1] CRAN (R 4.1.2)
##  R6            2.5.1   2021-08-19 [3] CRAN (R 4.2.0)
##  Rcpp          1.0.9   2022-07-08 [1] CRAN (R 4.1.2)
##  remotes       2.4.2   2021-11-30 [1] CRAN (R 4.1.2)
##  rlang         1.0.6   2022-09-24 [1] CRAN (R 4.1.2)
##  rmarkdown     2.18    2022-11-09 [3] CRAN (R 4.2.2)
##  rstudioapi    0.14    2022-08-22 [3] CRAN (R 4.2.1)
##  sass          0.4.2   2022-07-16 [3] CRAN (R 4.2.1)
##  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.1.2)
##  shiny         1.7.2   2022-07-19 [1] CRAN (R 4.1.2)
##  stringi       1.7.8   2022-07-11 [3] CRAN (R 4.2.1)
##  stringr       1.5.0   2022-12-02 [1] CRAN (R 4.1.2)
##  urlchecker    1.0.1   2021-11-30 [1] CRAN (R 4.1.2)
##  usethis       2.1.6   2022-05-25 [1] CRAN (R 4.1.2)
##  vctrs         0.5.2   2023-01-23 [1] CRAN (R 4.1.2)
##  xfun          0.34    2022-10-18 [3] CRAN (R 4.2.1)
##  xtable        1.8-4   2019-04-21 [1] CRAN (R 4.1.2)
##  yaml          2.3.6   2022-10-18 [3] CRAN (R 4.2.1)
##
##  [1] /home/jono/R/x86_64-pc-linux-gnu-library/4.1
##  [2] /usr/local/lib/R/site-library
##  [3] /usr/lib/R/site-library
##  [4] /usr/lib/R/library
##
## ──────────────────────────────────────────────────────────────────────────────

1. In part due to a strong representation of Haskell at my local Functional Programming Meetup.

2. I’m highly likely doing something wrong - I never wrote any Haskell before last week.

3. Numerical Recipes in Fortran 90 was about the most important book we had for writing code, basically nothing else was trusted - getting a digital copy of the code was considered a sign of true power.

4. What, you don’t have to integrate up to infinity in your code?