Tail recursion is an idiom for writing certain kinds of recursive functions. It does not require any new language features, but many implementations of functional languages like Racket or ML do recognize and optimize the use of tail recursion. Tail-recursive functions often use the accumulator pattern to transform a natural recursive function into tail-recursive form.

To understand tail recursion and accumulators, consider these functions for summing the elements of a list:

(define (sum1 xs)
  (if (null? xs)
      0
      (+ (car xs) (sum1 (cdr xs)))))

(define (sum2 xs)
  (define (sum-tail acc xs)
    (if (null? xs)
        acc
        (sum-tail (+ (car xs) acc) (cdr xs))))
  (sum-tail 0 xs))

Both functions compute the same results, but sum2 is more complicated, using a local helper function that takes an extra argument, called acc for “accumulator.” In the base case of sum-tail we return acc, and the value passed for the outermost call is 0, the same value used in the base case of sum1. This pattern is common: the base-case result in the non-accumulator style becomes the initial accumulator, and the base case in the accumulator style just returns the accumulator.

Why might \verb|sum2| be preferred when it is clearly more complicated? To answer, we need to understand a little bit about how function calls are implemented. Conceptually, there is a \emph{call stack}, which is a stack (the data structure with push and pop operations) with one element for each function call that has been started but has not yet completed. Each element stores things like the value of local variables and what part of the function has not been evaluated yet. When the evaluation of one function body calls another function, a new element is pushed on the call stack and it is popped off when the called function completes.

So for \verb|sum1|, there will be one call-stack element (sometimes just called a ``stack frame'') for each recursive call to \verb|sum1|, i.e., the stack will be as big as the list. This is necessary because after each stack frame is popped off, the caller still has to ``do the rest of the body'': namely, add \verb|(car xs)| to the recursive result and return.

Given the description so far, \verb|sum2| is no better: \verb|sum2| makes a call to \verb|sum-tail|, which then makes one recursive call for each list element. However, when \verb|sum-tail| makes a recursive call to \verb|sum-tail|, \emph{there is nothing more for the caller to do after the callee returns except return the callee's result}. This situation is called a \emph{tail call} (let's not try to figure out why it's called this) and functional languages like Racket and ML typically promise an essential optimization: when a call is a tail call, the caller's stack-frame is popped \emph{before} the call; the callee's stack-frame just \emph{replaces} the caller's. This makes sense: the caller was just going to return the callee's result anyway. Therefore, calls to \verb|sum2| never use more than a constant number of stack frames.
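To make the difference concrete, here is a sketch of how each version evaluates for a three-element list, following the definitions above (\verb|sum1| on the left, \verb|sum2|'s helper on the right):
\begin{verbatim}
(sum1 '(3 4 5))                  (sum-tail 0 '(3 4 5))
(+ 3 (sum1 '(4 5)))              (sum-tail 3 '(4 5))
(+ 3 (+ 4 (sum1 '(5))))          (sum-tail 7 '(5))
(+ 3 (+ 4 (+ 5 (sum1 '()))))     (sum-tail 12 '())
(+ 3 (+ 4 (+ 5 0)))              12
(+ 3 (+ 4 5))
(+ 3 9)
12
\end{verbatim}
On the left, the pending additions correspond to stack frames that pile up; on the right, each recursive call simply replaces the previous one, so the stack never grows.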

Why do implementations of functional languages include this optimization? By doing so, recursion can sometimes be as efficient as a while-loop, which also does not make the call-stack bigger. The ``sometimes'' is exactly when calls are tail calls, something you, the programmer, can reason about: you can look at the code and identify which calls are tail calls.

Tail calls do not need to be to the same function (\verb|f| can call \verb|g|), so they are more flexible than while-loops, which always have to ``call'' the same loop. Using an accumulator is a common way to turn a recursive function into a ``tail-recursive function'' (one where all recursive calls are tail calls), but it is not always possible. For example, functions that process trees (instead of lists) typically have call stacks that grow as big as the depth of the tree, but that is true in any language: while-loops are not very useful for processing trees.
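As a small illustration of tail calls between different functions (a sketch of ours, in the ML style of the examples below), these mutually recursive functions decide evenness; every call is a tail call, so they use constant stack space no matter how large the argument is:
\begin{verbatim}
(* each recursive call is the entire "rest" of the body,
   so it is a tail call *)
fun even n = if n = 0 then true  else odd  (n - 1)
and odd  n = if n = 0 then false else even (n - 1)
\end{verbatim}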

\section{More Examples of Tail Recursion}

Tail recursion is common for functions that process lists, but the concept is more general. For example, here are two implementations of the factorial function where the second one uses a tail-recursive helper function so that it needs only a small constant amount of call-stack space:
\begin{verbatim}
fun fact1 n = if n=0 then 1 else n * fact1(n-1)

fun fact2 n =
    let fun aux(n,acc) = if n=0 then acc else aux(n-1,acc*n)
    in aux(n,1) end
\end{verbatim}
It is worth noticing that \verb|fact1 4| and \verb|fact2 4| produce the same answer even though the former performs $4 * (3 * (2 * (1 * 1)))$ and the latter performs $(((1 * 4) * 3) * 2) * 1$. We are relying on the fact that multiplication is associative ($a*(b*c)=(a*b)*c$) and that multiplying by 1 is the identity ($1*x=x*1=x$). The earlier \verb|sum| example made analogous assumptions about addition. In general, converting a non-tail-recursive function to a tail-recursive function usually relies on associativity, but many of the operations we use are associative.

A more interesting example is this inefficient function for reversing a list:
\begin{verbatim}
fun rev1 lst =
    case lst of
        []    => []
      | x::xs => (rev1 xs) @ [x]
\end{verbatim}
We can recognize immediately that it is not tail-recursive: after the recursive call, it remains to append the result onto the one-element list holding the head of the list. Although this is the most natural way to reverse a list recursively, the inefficiency goes beyond creating a call stack of depth equal to the argument's length, which we will call $n$. The worse problem is that the total amount of work performed is proportional to $n^2$, i.e., this is a quadratic algorithm. The reason is that appending two lists takes time proportional to the length of the first list: it has to traverse the first list (see our own implementations of append discussed previously). Over all the recursive calls to \verb|rev1|, we call \verb|@| with first arguments of length $n-1, n-2, \ldots, 1$, and the sum of the integers from $1$ to $n-1$ is $n(n-1)/2$.

As you learn in a data structures and algorithms course, quadratic algorithms like this are much slower than linear algorithms for large enough $n$. That said, if you expect $n$ to always be small, it may be worth valuing the programmer's time and sticking with the simple recursive algorithm. If not, fortunately, the accumulator idiom leads to an almost-as-simple linear algorithm.

\begin{verbatim}
fun rev2 lst =
    let fun aux(lst,acc) =
            case lst of
                []    => acc
              | x::xs => aux(xs, x::acc)
    in aux(lst,[]) end
\end{verbatim}

The key differences are (1) \verb|rev2| is tail-recursive and (2) it does only a constant amount of work for each recursive call, because \verb|::| does not have to traverse either of its arguments.
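A short trace (following the definition of \verb|rev2| above) shows how the accumulator builds the reversed list one cons at a time:
\begin{verbatim}
rev2 [1,2,3]
aux([1,2,3], [])
aux([2,3],   [1])
aux([3],     [2,1])
aux([],      [3,2,1])
[3,2,1]
\end{verbatim}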

\section{A Precise Definition of Tail Position}

While most people rely on intuition for which calls are tail calls, we can be more precise by defining \emph{tail position} recursively and saying a call is a tail call if it is in tail position. The definition has one part for each kind of expression; here are several parts:
\begin{itemize}
\item In \texttt{fun f(x) = e}, \texttt{e} is in tail position.
\item If an expression is not in tail position, then none of its subexpressions are in tail position.
\item If \texttt{if e1 then e2 else e3} is in tail position, then \texttt{e2} and \texttt{e3} are in tail position (but \texttt{e1} is not). (Case-expressions are similar.)
\item If \texttt{let b1 ... bn in e end} is in tail position, then \texttt{e} is in tail position (but no expressions in the bindings are).
\item Function-call arguments are never in tail position.
\item ...
\end{itemize}
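To practice applying the definition, here is a small sketch of ours (the functions \verb|len| and \verb|len2| are hypothetical, not defined above) with comments marking which calls are in tail position:
\begin{verbatim}
fun len xs =
    case xs of
        []   => 0
      | _::t => 1 + len t    (* len t is NOT in tail position:
                                it is an argument to +        *)

fun len2 (xs, acc) =
    case xs of
        []   => acc
      | _::t => len2 (t, acc + 1)   (* tail position: this call is
                                       the entire case arm of the body *)
\end{verbatim}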