March 2011: Immutability in Clojure - Part 1, Programming Without Variables

Immutability in Clojure - Part 1, Programming Without Variables

By Steve Molitor, OCI Senior Software Engineer

MARCH 2011



Mutable
adjective 1. liable or subject to change 2. given to changing; constantly changing; fickle or inconstant: the mutable ways of fortune // Synonyms: capricious, flickery, fluctuating, mercurial, fickle, unpredictable, unstable

Sources: dictionary.com, merriam-webster.com


The Problem with Objects: Mutability

If you are a "programmer of a certain age," you will remember some of the initial hype surrounding OOP and objects: Objects would rescue us from the maze of procedural spaghetti code through encapsulation. They would provide us with reusable building blocks, easy to reason about and reuse – just like LEGO blocks or bricks in a house.

When building a house, you do not need to worry about the internal structure of your bricks: one brick is as good as another (assuming you have a reliable supplier). Objects would be like those bricks – easy to reason about, quick to assemble into larger composite structures, and easy to substitute one object for another, as long as the substitute observed the same contract.

Now I believe that objects have improved our code in many ways. But there is one major fly in the ointment: mutability.

In most OO languages, including Java, objects are mutable by default. This makes it much harder to reason about a program's behavior.

When you pass an object to a method, you have no idea what will happen – the method may change the object's properties, or some other object's properties, perhaps through a chain of nested calls.

When building a house, one does not expect one brick to reach out and change the properties of another brick, which triggers that brick to change something about the brick next to it, and so on. It would be impossible to build a house using magic, mutable bricks like those. But that is what we try to do when building systems out of mutable objects.

Through a combination of convention and discipline, it is possible to create immutable objects in Java, but there are no guarantees. And these approaches break down with collections, so immutability is used only for small-value objects.

Introducing Clojure

Clojure is a new functional language written specifically for the JVM.

Clojure is a variation of LISP that brings functional languages concepts to the JVM. Clojure also integrates well with Java code, so it can take advantage of the huge number of Java libraries.

Clojure provides many useful features, but in this article, we will look at just one: immutability.

Benefits of Immutability

In Clojure, immutability is the default. Once an object or data structure is created, it cannot be changed. This provides many benefits:

Understanding

As mentioned above, mutability makes it much harder to reason about and understand programs, because any object passed to a method could get changed under the covers.

In an immutable system, however, one only needs to understand the input and output values passed to a function to reason about its behavior.

Testability

Since the set of possible states and transitions is much smaller, an immutable system is easier to fully test.

Invariants

Since an object cannot change once constructed, invariants can be enforced when an object is constructed and never need to be checked again.

Once you validate that an object is constructed in a valid state, you never need to validate it again – it can never become invalid.

Equality

Testing two mutable objects for equality is illusory, since, while they might be equal now, they could become unequal later if a property of one of the objects changes.

If two immutable objects are equal, they will always be equal.

Cheap Sharing

In Java, one often makes defensive copies of objects to prevent errors when a shared object reference is changed.

In Clojure, object references can be safely shared, since they can never change. We can more aggressively intern objects and implement memory-saving techniques, like the flyweight pattern.

Concurrency

Immutable objects are thread-safe by definition. If an object cannot be changed, it can be shared among different concurrent threads without concurrent modification exceptions.

Look Ma, No Variables!

While there are many benefits to pervasive immutability, it does require different programming styles. It takes some getting used to.

This article will examine one of the first humps you have to get over when programming in a functional language like Clojure – the absence of true variables.

In Clojure, since everything is immutable, there are no variables.

Actually the term 'variable' is sometimes used, but you can't vary their values. So they are more like constants. It's as if you had to declare all of your variables 'final' in Java.

But how does one program without variables? How can we do simple things like looping without variables? Consider the following Java method:

           public void greetingsFromSanta(int n) {
                for (int i = 0; i < n; ++i) {
                    System.out.print("Ho! ");
                }
            }

If we called this method as greetingsFromSanta(3) it would print "Ho! Ho! Ho!"

As we know, the for loop form in Java is syntactic sugar for something like this:

          public void greetingsFromSanta() {
                int i = 0;
                while (i < n) {
                    System.out.print("Ho! ");
                    i = i + 1;
                }
            }

The i variable is incremented and then assigned back to itself. The value of i is not constant; it's variable.

How could we write such a method without incrementing the i variable?

Traditionally in functional programming languages such as Scheme and Lisp, this sort of thing is done using recursion. We could write a similar function in Clojure as follows:

            (defn greetings-from-santa [n]
                (if (> n 0)
                    (do
                        (print "Ho! ")
                        (greetings-from-santa (dec n)))))

We define a function called greetings-from-santa that takes one parameter, n.

If n is greater than zero, we print "Ho! ", and recursively call greetings-from-santa again, passing it a new value for n that is one less than the current value. (The dec function decrements a value by one.)

One problem with recursion is that it can generate large call stacks.

If Santa suddenly got very talkative and passed large values for n, our little function could consume lots of stack memory. However, if the recursive call is the last expression evaluated in a function (the most common case), the compiler can unroll the function into a non-recursive loop that is more efficient. This is called 'tail call optimization'.

Unfortunately the JVM does not directly support automatic tail call optimization. To work around this, Clojure has a special method called recur that can be used to tell the Clojure compiler to perform tail call optimization.

The call to recur must be the last expression evaluated in the function; if it is not, the compiler will throw an UnsupportedOperationException.

We can make our function more efficient by using recur:

          (defn greetings-from-santa [n]
                (if (> n 0)
                    (do
                        (print "Ho! ")
                        (recur (dec n)))))

This tells the compiler to perform tail call optimization – in other words, to unroll the stack into a flat and efficient loop.

Of course, Clojure has looping functions built in.

We could write our function more simply using the dotimes function that comes with Clojure:

            (defn greetings-from-santa [n]
                (dotimes [i 3] (print "Ho! ")))

The dotimes macro takes two arguments, the argument name containing the current loop index (not used in this example) and the number of times to execute, followed by the body of code to execute.

Since we are not using it, it would be more idiomatic to use an underscore as the placeholder value for the loop counter:

            (defn greetings-from-santa [n]
                (dotimes [_ 3] (print "Ho! ")))

Traditionally, functional programming languages such as LISP and Scheme have used recursion extensively. However, Clojure provides something called 'lazy sequences' which provide an alternative to recursion in many cases.

Lazy sequences are like 'virtual collections,' in which each item is calculated by calling a function when the item is needed. However, lazy sequences are the subject of a future article!

Conclusion

We have just scratched the surface of immutable programming. We see how we can program without true variables, but we need to look at lazy sequences and Clojure's persistent collections.

Without mutability, collections cannot be modified. Instead, they must be copied, or at least appear to be copied. This might seem expensive, but Clojure's persistent collections provide an efficient mechanism for using immutable collections. This too will be the subject of a future article. Stay tuned!



Software Engineering Tech Trends (SETT) is a monthly newsletter featuring emerging trends in software engineering. 
 

Check out OTHER CLOJURE articles! 
secret