Commit Logs

Scala Pattern Matching

Pattern matching is one of the most used features in Scala. Imagine you need to write a series of if-else-if statements to analyze the arguments of a function. All those extra spaces and parentheses can get annoying, and you may wish for something more elegant. Well, pattern matching is here to the rescue (some smarty pants may think of a hash map; more on this later). It is like the switch statement in C/C++/Java, but better and more powerful.

A pattern match consists of a sequence of alternatives, each made up of a pattern introduced by the keyword case and one or more expressions that are evaluated if the pattern matches. In this post, we will see some of its use cases and best practices.

Simple pattern matching

Simple pattern matching is very similar to the common use of switch statements in other languages, i.e. each pattern is of the same type. A toy function that takes an integer and returns the corresponding month as a string may look like this.

def intToMonth(i: Int): String = i match {
case 1 => "Jan"
case 2 => "Feb"
case 3 => "Mar"
case 4 => "Apr"
case 5 => "May"
case 6 => "Jun"
case 7 => "Jul"
case 8 => "Aug"
case 9 => "Sep"
case 10 => "Oct"
case 11 => "Nov"
case 12 => "Dec"
case _ => throw new Error("Invalid Month")
}

intToMonth(3)
// res0: String = Mar

intToMonth(13)
// java.lang.Error: Invalid Month

This toy example should be quite straightforward. Note that case _ is the catch-all pattern. Depending on the use case, you may choose to handle unmatched input differently. If you leave out the catch-all and no case matches, a generic scala.MatchError is thrown at runtime, which is not recommended.
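
If throwing is not desirable, one option is to make the missing case explicit in the return type. The sketch below is a hypothetical variant, intToMonthOption (not part of the original example), that returns an Option[String] so the caller decides what to do with invalid input.

```scala
// Hypothetical variant of intToMonth: invalid input becomes None
// instead of an exception.
def intToMonthOption(i: Int): Option[String] = i match {
  case 1  => Some("Jan")
  case 2  => Some("Feb")
  case 3  => Some("Mar")
  case 4  => Some("Apr")
  case 5  => Some("May")
  case 6  => Some("Jun")
  case 7  => Some("Jul")
  case 8  => Some("Aug")
  case 9  => Some("Sep")
  case 10 => Some("Oct")
  case 11 => Some("Nov")
  case 12 => Some("Dec")
  case _  => None // catch-all: nothing is thrown
}

// The caller chooses the fallback.
val known   = intToMonthOption(3)                        // Some(Mar)
val unknown = intToMonthOption(13).getOrElse("Unknown")  // Unknown
```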

Pattern matching in Scala is less error-prone than the switch statement in C/C++, since it doesn’t suffer from the “fall-through” problem: in C/C++, an explicit break is required at the end of each branch to prevent execution from falling through to the next one, whereas in Scala only the matching case is evaluated.
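
Fall-through in C is typically used on purpose to let several values share one branch. In Scala the same intent can be written by listing the alternatives with |. A small sketch, where daysInMonth is a hypothetical helper that deliberately ignores leap years:

```scala
// Hypothetical helper: several values share one branch via `|`,
// with no break statements and no accidental fall-through.
def daysInMonth(month: Int): Int = month match {
  case 2                            => 28 // leap years ignored in this sketch
  case 4 | 6 | 9 | 11               => 30
  case 1 | 3 | 5 | 7 | 8 | 10 | 12  => 31
  case _ => throw new IllegalArgumentException(s"Invalid month: $month")
}

val april = daysInMonth(4) // 30
```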

Another common use case is parsing the options of an application. The return type is Unit since each branch calls another function purely for its side effect, i.e. displaying a message on the command line (displayHelp, displayVersion, and displayInvalidOption are assumed to be defined elsewhere).

def parseOption(opt: String): Unit = opt match {
case "-h" | "--help" => displayHelp
case "-v" | "--version" => displayVersion
case invalidOption => displayInvalidOption(invalidOption)
}

Note that when you need to use the unmatched value in the expression, you can’t access it through the wildcard _. Instead, you bind it to a variable name, such as invalidOption above, which matches anything just like _ does.

Match patterns of integers

Let’s take another look at our intToMonth example. For a simple match expression like this, there is a particular optimization available: the compiler can turn the match into a tableswitch or lookupswitch bytecode instruction, which is better for performance. Scala provides the @switch annotation (from scala.annotation.switch) to verify this; when the annotated match cannot be compiled to a switch, the compiler issues a warning instead of silently falling back. With a switch, a value can jump directly to the result rather than walking through a decision tree.

import scala.annotation.switch

def intToMonth(i: Int): String = (i: @switch) match {
case 1 => "Jan"
case 2 => "Feb"
case 3 => "Mar"
case 4 => "Apr"
case 5 => "May"
case 6 => "Jun"
case 7 => "Jul"
case 8 => "Aug"
case 9 => "Sep"
case 10 => "Oct"
case 11 => "Nov"
case 12 => "Dec"
case _ => throw new Error("Invalid Month")
}

To use this optimization, a few conditions have to be met.

  1. The matched value must be a known integer.
  2. The matched expression must be “simple”. It can’t contain any type checks, if statements, or extractors.
  3. The expression must also have its value available at compile time.
  4. There should be more than two case statements.
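
As a sketch of what these conditions look like in practice, the match below uses named constants declared as final val without a type annotation, which the compiler treats as compile-time constants. The HttpCodes object and describeStatus function are hypothetical names chosen for illustration.

```scala
import scala.annotation.switch

// Hypothetical constants: `final val` with no type annotation makes
// each one a compile-time constant.
object HttpCodes {
  final val Ok          = 200
  final val NotFound    = 404
  final val ServerError = 500
}

// Integer scrutinee, constant patterns only, more than two cases.
def describeStatus(code: Int): String = (code: @switch) match {
  case HttpCodes.Ok          => "OK"
  case HttpCodes.NotFound    => "Not Found"
  case HttpCodes.ServerError => "Server Error"
  case _                     => "Unknown"
}

val notFound = describeStatus(404) // Not Found
```

If one of the cases were replaced with a type check, the match would no longer qualify and the annotation would be expected to make the compiler say so.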

Hash map

To achieve a similar level of optimization when every pattern is a constant of the same type, you don’t actually have to use pattern matching. In most cases, a hash map would do the trick.

val intToMonthMap: Map[Int, String] = Map(
1 -> "Jan",
2 -> "Feb",
3 -> "Mar",
4 -> "Apr",
5 -> "May",
6 -> "Jun",
7 -> "Jul",
8 -> "Aug",
9 -> "Sep",
10 -> "Oct",
11 -> "Nov",
12 -> "Dec"
)

intToMonthMap(3)
// res0: String = Mar

Of course, a hash map has its own limitations, plus some overhead from maintaining another variable. That’s why pattern matching is generally more powerful, as you will see in the next few sections.
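
One such limitation is what happens on a missing key: applying a Map directly to a key it does not contain throws a NoSuchElementException. A brief sketch of safer lookups with get and getOrElse from the standard Map API, using a shortened map (monthMap is a name made up here) so the block stands alone:

```scala
// Shortened, hypothetical version of the month map.
val monthMap: Map[Int, String] = Map(1 -> "Jan", 2 -> "Feb", 3 -> "Mar")

val found    = monthMap.get(3)                   // Some(Mar)
val missing  = monthMap.get(13)                  // None
val fallback = monthMap.getOrElse(13, "Unknown") // Unknown
```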

Match patterns of sequences

In a slightly more complicated scenario, you may also want to match against sequences, such as List or Array. They are more complicated because there are two dimensions to match: the values and the length of the sequence. Let’s look at this toy example that matches patterns against a list of integers.

def matchList(list: List[Int]): Unit = list match {
case List(1, _, _) => println("A list with three integers and starts with 1")
case List(1, _) => println("A list with two integers and starts with 1")
case List(1, _*) => println("A list starts with 1 and has any number of integers")
case _ => println("None of the above")
}

The wildcards are pretty powerful. Here, we use _ to stand for a single element in a sequence and _* for zero or more elements. Note that, in the above example, case List(1, _*) would on its own match every list that starts with 1, whatever its length. Because cases are tried in order, though, the two- and three-element lists are already taken by the earlier cases, so it effectively matches lists that start with 1 and have either exactly one element or more than three.
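
To see the effect of case order, here is a sketch of the same match that returns a description instead of printing it, so the results can be inspected. describeList is a hypothetical name.

```scala
// Same patterns as matchList, but returning the description.
def describeList(list: List[Int]): String = list match {
  case List(1, _, _) => "three elements, starts with 1"
  case List(1, _)    => "two elements, starts with 1"
  case List(1, _*)   => "starts with 1, some other length"
  case _             => "none of the above"
}

val single = describeList(List(1))          // starts with 1, some other length
val triple = describeList(List(1, 2, 3))    // three elements, starts with 1
val longer = describeList(List(1, 2, 3, 4)) // starts with 1, some other length
val other  = describeList(List(2, 3))       // none of the above
```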

Match typed patterns

Besides these simple use cases, pattern matching can handle typed patterns. That is, you can use it to detect the type of the input so that each type is dealt with correctly.

def stringifyInput(x: Any): String = x match {
case i: Int => s"Input Integer: $i"
case d: Double => s"Input Double: $d"
case s: String => s"Input String: $s"
case ai: Array[Int] => s"Input Array of Int: ${ai.mkString(",")}"
case as: Array[String] => s"Input Array of String: ${as.mkString(",")}"
}

stringifyInput(3)
// res0: String = Input Integer: 3

stringifyInput(3.0)
// res1: String = Input Double: 3.0

stringifyInput("s")
// res2: String = Input String: s

stringifyInput(Array(2, 3))
// res3: String = Input Array of Int: 2,3

stringifyInput(Array("s", "t"))
// res4: String = Input Array of String: s,t

This code pattern replaces an if-isInstanceOf combination and is useful for walking through a structure built with the composite design pattern.

Note that, since the pattern match occurs at runtime and generic type arguments are erased on the JVM, one cannot match on a specific Map type.

case m: Map[String, Int] => ... // this doesn't work: the type arguments are not checked at runtime
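
One possible workaround, sketched below under the assumption that inspecting the entries is acceptable, is to match on the erased shape Map[_, _] and check the keys and values in a guard. describeMap is a hypothetical function, and an empty map passes the guard vacuously.

```scala
// Hypothetical workaround: only Map[_, _] can be checked at runtime,
// so the guard inspects each entry instead.
def describeMap(x: Any): String = x match {
  case m: Map[_, _] if m.forall { case (k, v) =>
        k.isInstanceOf[String] && v.isInstanceOf[Int]
      } =>
    s"Map of String to Int with ${m.size} entries"
  case _: Map[_, _] => "Some other Map"
  case _            => "Not a Map"
}

val stringToInt = describeMap(Map("a" -> 1)) // Map of String to Int with 1 entries
val intToString = describeMap(Map(1 -> "a")) // Some other Map
```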

Match case class

More generally, we can use case classes in pattern matching to handle complex data structures. I find this to be one of the most common use cases of pattern matching in practice, although its syntax is quite similar to what we have already seen.

case class Person(name: String, age: Int, isMarried: Boolean)

def matchMarriageStatus(p: Person): Unit = p match {
case Person(name, age, true) => println(name + " age " + age.toString + " is married.")
case Person(name, age, false) => println(name + " age " + age.toString + " is not married.")
case _ => println("Unknown Marriage Status.")
}

matchMarriageStatus(Person("Tom", 32, true))
// Tom age 32 is married.

matchMarriageStatus(Person("James", 22, false))
// James age 22 is not married.

You can get more advanced with case classes and case objects in pattern matching. For example, just as with typed patterns, several case classes may extend a common abstract class, and you can easily handle each of them with its own logic.
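
A minimal sketch of that idea, using a made-up Shape hierarchy (none of these names come from the example above). Marking the parent as sealed is an extra choice made here: it lets the compiler warn when a match misses one of the subclasses.

```scala
// Hypothetical hierarchy: case classes and a case object
// sharing a sealed abstract parent.
sealed abstract class Shape
case class Circle(radius: Double) extends Shape
case class Rectangle(width: Double, height: Double) extends Shape
case object UnitSquare extends Shape

// Each subclass is handled by its own case; fields are extracted
// directly in the pattern.
def area(shape: Shape): Double = shape match {
  case Circle(r)       => math.Pi * r * r
  case Rectangle(w, h) => w * h
  case UnitSquare      => 1.0
}

val rectArea = area(Rectangle(2.0, 3.0)) // 6.0
```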

Add guards to case statements

Additional qualifying logic can also be easily added to a case statement via an if expression, known as a guard. As an arbitrary example, suppose you only want to stringify numeric values that fall within certain ranges.

def stringifyInputWithConditions(x: Any): String = x match {
case i: Int if 0 to 10 contains i => s"Input Integer: $i"
case d: Double if d >= 5 && d <= 10 => s"Input Double: $d"
case l: Long if l == 10L => s"Input Long: $l"
case _ => s"Invalid Input"
}

The conditions in guards, i.e. the if expressions, can also refer to fields extracted from a case class, which turns out to be a very handy feature in practice.
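
For instance, a guard can test a field right after the pattern extracts it. The sketch below redefines the Person case class so it stands alone; describePerson and the age threshold of 18 are assumptions made for illustration.

```scala
case class Person(name: String, age: Int, isMarried: Boolean)

// Hypothetical example: the guard uses the extracted `age` field.
// Cases are tried in order, so minors are handled first.
def describePerson(p: Person): String = p match {
  case Person(name, age, _) if age < 18 => s"$name is a minor."
  case Person(name, _, true)            => s"$name is a married adult."
  case Person(name, _, false)           => s"$name is an unmarried adult."
}

val minor = describePerson(Person("Amy", 12, false)) // Amy is a minor.
val adult = describePerson(Person("Tom", 32, true))  // Tom is a married adult.
```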

Ending

In this post, we went through some common use cases of pattern matching. Hopefully you are convinced, or at least starting to think, that pattern matching is quite a powerful feature in Scala. Because of that, I can’t possibly cover all of its use cases here. More importantly, you will find it widely used in many projects, where it helps express complex logic in a concise and readable syntax.