Immutable objects are objects that cannot be changed once they have been created. The idea is simple, but it is a useful tool for making your program easier to reason about. This post will discuss some of the benefits of immutable objects (or, more generally, immutable data) as well as how several different languages make it easier to work with such data. In particular, I will look at Java, as well as Kotlin and Rust, two relatively new languages that I have been experimenting with recently.

Benefits of Immutable Data

Simplified Reasoning

Consider the following example (in Java) where Point is a class representing a coordinate on a 2-d grid.

public int computeResult(Point p) {
    int result = 0;
    for (int i = 0; i < 100; i++) {
        result += i * p.computeSomething();
    }
    return result;
}

If, after some benchmarking, you find that the repeated calls to computeSomething() take up a large portion of the program execution time, you may be tempted to extract the computation out of the loop.

public int computeResult(Point p) {
    int something = p.computeSomething();
    int result = 0;
    for (int i = 0; i < 100; i++) {
        result += i * something;
    }
    return result;
}

Assuming that the Point class is immutable and that the result of computeSomething() is only dependent on the value of p, that is a perfectly valid optimization.

However, if Point is mutable and computeSomething() modifies p, the final result may be different since computeSomething() might return a different value each time. In this simple example, this error would probably be caught by even a basic test suite. However, if you are using a more complicated object that gets passed between many methods, it may be difficult to write unit tests that will cover all the ways that the object might mutate. Furthermore, even a simple method may have unexpected results in a multi-threaded environment, such as the example below.

Point p = new Point(1, 2);

new Thread(() -> {
    computeResult(p);
}).start();

new Thread(() -> {
    p.setX(3);
}).start();

In this example, computeResult() might return different results depending on which version of the method we use, even if computeSomething() does not modify the point, because the point is modified externally on a different thread. In addition, the result may change each time this snippet is run, depending on how the threads are scheduled and when setX() occurs.

Using immutable objects eliminates these issues and makes it easier to reason about the expected outputs and behavior of your program. In the first example, it allows you to confidently assert that the output of p.computeSomething() will be consistent which makes the optimization possible. In the second example, the fact that the object cannot be modified means that it will not change unexpectedly on a separate thread.
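To make the threading case concrete, here is a minimal sketch of the same scenario with an immutable point. The names FixedPoint, withX(), and SharedPointDemo are my own for illustration, and computeSomething() is assumed to be a simple function of x and y. The second thread can only produce a new point, so the first thread always sees the original values.

```java
// Hypothetical immutable point. computeSomething() stands in for the
// unspecified computation from the earlier example.
final class FixedPoint {
    private final int x;
    private final int y;

    FixedPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // "Changing" x produces a new instance; this one is left untouched.
    FixedPoint withX(int newX) {
        return new FixedPoint(newX, y);
    }

    // Depends only on final fields, so every call returns the same value.
    int computeSomething() {
        return x + y;
    }
}

final class SharedPointDemo {
    // The hoisted version of computeResult() from above.
    static int computeResult(FixedPoint p) {
        int something = p.computeSomething();
        int result = 0;
        for (int i = 0; i < 100; i++) {
            result += i * something;
        }
        return result;
    }

    // Runs computeResult() on one thread while another thread "moves" the point.
    static int runConcurrently() {
        FixedPoint p = new FixedPoint(1, 2);
        int[] result = new int[1];

        Thread reader = new Thread(() -> result[0] = computeResult(p));
        Thread writer = new Thread(() -> {
            FixedPoint moved = p.withX(3); // a new object; p itself never changes
        });

        reader.start();
        writer.start();
        try {
            reader.join(); // join() makes the reader's write to result[0] visible
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(runConcurrently()); // prints 14850 on every run
    }
}
```

No matter how the two threads are scheduled, the reader computes its result from the point (1, 2).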

Optimization

Immutability allows you to make additional assumptions about the object which can enable performance or memory optimizations. Extending from the Point example, let’s make it an interface instead.

public interface Point {
    int getX();
    int getY();
    // This represents any method which performs a potentially expensive
    // computation.
    double getDistanceFromOrigin();
}

A simple implementation might look like this.

public final class DefaultPoint implements Point {
    private final int x;
    private final int y;

    public DefaultPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override public int getX() {
        return this.x;
    }

    @Override public int getY() {
        return this.y;
    }

    @Override public double getDistanceFromOrigin() {
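        // Cast before multiplying so that large coordinates do not overflow
        // int arithmetic.
        double dx = (double) this.x * this.x;
        double dy = (double) this.y * this.y;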
        return Math.sqrt(dx + dy);
    }
}

This object uses 8 bytes (4 bytes for each int, x and y; ignoring overhead which is present for all objects) and requires multiple floating-point operations to compute the distance from the origin.

If you make many calls to getDistanceFromOrigin() you may decide to add the following implementation.

public class ComputationOptimizedPoint implements Point {
    private final int x;
    private final int y;
    private final double distanceFromOrigin;

    public ComputationOptimizedPoint(int x, int y) {
        this.x = x;
        this.y = y;

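        // Cast before multiplying so that large coordinates do not overflow
        // int arithmetic.
        double dx = (double) this.x * this.x;
        double dy = (double) this.y * this.y;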
        this.distanceFromOrigin = Math.sqrt(dx + dy);
    }

    @Override public int getX() {
        return this.x;
    }

    @Override public int getY() {
        return this.y;
    }

    @Override public double getDistanceFromOrigin() {
        return this.distanceFromOrigin;
    }
}

This implementation uses an additional 8 bytes (the size of a double) and adds cost to the constructor for computing the distance from the origin, but it does not require any floating-point operations when calling getDistanceFromOrigin().

We can also create optimized implementations for specific cases. For example, consider the following implementation of the origin point. It uses 0 bytes and requires no floating-point operations.

public class OriginPoint implements Point {
    @Override public int getX() {
        return 0;
    }

    @Override public int getY() {
        return 0;
    }

    @Override public double getDistanceFromOrigin() {
        return 0.0;
    }
}
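One way to put specialized implementations like these to work is a factory method that picks the cheapest implementation for its arguments. The sketch below is only an illustration: the Points.of() factory and the shared OriginPoint.INSTANCE are names I made up, and the implementations are condensed versions of the ones above.

```java
interface Point {
    int getX();
    int getY();
    double getDistanceFromOrigin();
}

// General case: stores both coordinates and computes the distance on demand.
final class DefaultPoint implements Point {
    private final int x;
    private final int y;

    DefaultPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override public int getX() { return x; }
    @Override public int getY() { return y; }

    @Override public double getDistanceFromOrigin() {
        return Math.sqrt((double) x * x + (double) y * y);
    }
}

// Special case: no fields, and a single instance shared by all callers.
final class OriginPoint implements Point {
    static final OriginPoint INSTANCE = new OriginPoint();

    private OriginPoint() {}

    @Override public int getX() { return 0; }
    @Override public int getY() { return 0; }
    @Override public double getDistanceFromOrigin() { return 0.0; }
}

// Hypothetical factory: callers only see the Point interface, so the
// implementation can be chosen freely.
final class Points {
    private Points() {}

    static Point of(int x, int y) {
        if (x == 0 && y == 0) {
            return OriginPoint.INSTANCE;
        }
        return new DefaultPoint(x, y);
    }
}
```

Swapping implementations behind the factory is only safe because every Point is immutable. If callers could mutate the shared origin instance, every other holder of that instance would see the change.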

Eight bytes is not much, and in this particular example you might not see much, if any, difference in practice. However, for more complex classes, this can result in much more drastic memory and performance improvements. A good example of this is the set of immutable collections introduced in Java 9, accessible via the new List.of(), Set.of(), and Map.of() methods.

The most drastic difference is probably for a set containing a single element. ImmutableCollections.Set1 requires only enough memory for a single reference (again, I am ignoring standard object overhead, as well as the object contained in the set since that will be common between implementations). In contrast, a HashSet with a single element requires a HashMap, which in turn requires three ints, a float, a reference to a cached entrySet(), and most importantly a 16-element array, one element of which will be a HashMap.Node, which contains an int and three object references¹. I won’t compare performance because it is significantly more complex to analyze and would not provide much value to this post. However, given that Set1 is implemented in only ~40 lines of code, as opposed to HashSet and HashMap which are nearly 4000 lines combined, it is reasonable to assume that Set1 performs fewer operations and would likely perform better.
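For readers who have not used these factory methods, here is a small sketch (requires Java 9 or later) showing the behavioral difference between a set from Set.of() and a HashSet. The class and method names are mine. Note that it demonstrates immutability only, not the memory layout discussed above.

```java
import java.util.HashSet;
import java.util.Set;

final class ImmutableSetDemo {
    // Returns true if the given set refuses to accept a new element.
    static boolean rejectsAdd(Set<String> set) {
        try {
            set.add("extra");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Set<String> fixed = Set.of("a");                 // unmodifiable
        Set<String> growable = new HashSet<>(fixed);     // mutable copy

        System.out.println(rejectsAdd(fixed));    // prints "true"
        System.out.println(rejectsAdd(growable)); // prints "false"
    }
}
```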

Caching

Another optimization that you can perform is to cache and reuse objects. Continuing with the Point example again, if you have a large number of Points and multiple instances which represent the same coordinate, you could instead create a cache and re-use the same instance, as in the example below. This reduces memory usage (there is some overhead in order to maintain the cache, so you should consider whether or not this will actually benefit your use case) and can improve performance by eliminating potentially expensive constructor calls.

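import java.util.HashMap;
import java.util.Map;

public class Point {
    // HashMap is not thread-safe; guard it or use a concurrent map if
    // createPoint() may be called from several threads.
    private static final Map<Long, Point> CACHE = new HashMap<>();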

    private final int x;
    private final int y;

    // Make the constructor private, forcing the use of the factory method.
    private Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    public static Point createPoint(int x, int y) {
        // Compute a unique key for retrieving Points from the cache.
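        // Mask y so that sign extension of a negative value cannot overwrite
        // the bits that hold x.
        long key = ((long) x << 32) | (y & 0xFFFFFFFFL);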
        return Point.CACHE.computeIfAbsent(key, k -> new Point(x, y));
    }
}
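If the factory might be called from several threads, one option is to back the cache with a ConcurrentHashMap. The following is a sketch under that assumption, using a hypothetical CachedPoint class rather than the Point above. It also shows what callers observe: equal coordinates yield the same instance.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class CachedPoint {
    // ConcurrentHashMap makes computeIfAbsent() safe to call concurrently.
    private static final Map<Long, CachedPoint> CACHE = new ConcurrentHashMap<>();

    private final int x;
    private final int y;

    private CachedPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    static CachedPoint of(int x, int y) {
        // Pack x into the high 32 bits and y into the low 32 bits. The mask
        // keeps a negative y from sign-extending into the bits that hold x.
        long key = ((long) x << 32) | (y & 0xFFFFFFFFL);
        return CACHE.computeIfAbsent(key, k -> new CachedPoint(x, y));
    }

    int getX() { return x; }
    int getY() { return y; }

    public static void main(String[] args) {
        System.out.println(of(1, -1) == of(1, -1)); // prints "true"
        System.out.println(of(1, -1) == of(0, -1)); // prints "false"
    }
}
```

Sharing instances this way is only safe because the points cannot change. With a mutable point, one caller moving "its" point would silently move everyone else's too.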

Limitations of Immutable Data

Although I would recommend using immutable objects in most cases, they are not a universal solution and as with nearly all things in software engineering there are tradeoffs that you should consider.

In particular, immutable objects can be restricting. Immutability can limit the behavior of objects and it may be difficult or impossible to do certain things efficiently with immutable objects. For example, suppose we use a Point to represent the position of a character in a game. You’ll probably want your character to move, which means the Point will need to change. If Point is immutable, that means creating a new instance every time the character moves. Furthermore, if the character is also immutable, that means you must also create a new character instance each time. In addition to the overhead of repeatedly creating objects, this can create a lot of garbage and put pressure on the garbage collector, which will have a negative impact on performance.
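As a rough sketch of what that looks like, assume hypothetical Position and GameCharacter classes (neither is from a real game library). Every move allocates two objects, one for the new position and one for the new character.

```java
// Hypothetical immutable position on the game grid.
final class Position {
    final int x;
    final int y;

    Position(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Returns a new Position offset by (dx, dy); this one is unchanged.
    Position translate(int dx, int dy) {
        return new Position(x + dx, y + dy);
    }
}

// Hypothetical immutable character that holds an immutable position.
final class GameCharacter {
    final String name;
    final Position position;

    GameCharacter(String name, Position position) {
        this.name = name;
        this.position = position;
    }

    // Moving cannot update the position in place, so it allocates a new
    // Position and a new GameCharacter wrapping it.
    GameCharacter move(int dx, int dy) {
        return new GameCharacter(name, position.translate(dx, dy));
    }

    public static void main(String[] args) {
        GameCharacter start = new GameCharacter("hero", new Position(0, 0));
        GameCharacter moved = start.move(1, 2);
        System.out.println(start.position.x + ", " + start.position.y); // prints "0, 0"
        System.out.println(moved.position.x + ", " + moved.position.y); // prints "1, 2"
    }
}
```

In a game loop that moves many characters every frame, those short-lived allocations are where the garbage collector pressure comes from.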

Implementing Immutable Objects …

Now that we’ve gone over some of the benefits and use cases of immutable data, let’s discuss how to implement them in Java, Kotlin, and Rust. This is not meant to be a tutorial for any of these languages, so I will not be going into much detail. The point is simply to compare how a few different languages treat immutability.

… In Java

First we’ll start with a fairly standard Java (7+) implementation of an immutable Point.

import java.util.Objects;

// The `final` keyword is important here. If the class were not final, there
// could be a subclass that breaks immutability.
public final class Point {
    // Making the fields `final` is not strictly necessary as long as they are
    // private and you are careful not to modify the fields. However, doing so
    // lets the compiler help ensure that you do not reassign primitives or
    // object references. Note that if one of the fields is an object, the
    // compiler cannot prevent you from mutating that object.
    private final int x;
    private final int y;

    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    public int getX() {
        return x;
    }

    public int getY() {
        return y;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj instanceof Point) {
            Point other = (Point) obj;

            // I am using Objects.equals() to show a more general approach which
            // handles object fields which may be null. If the constructor(s)
            // disallow nulls, then Object.equals(Object) can be used instead.
            // For primitives, '==' can be used instead.
            return Objects.equals(this.x, other.x)
                    && Objects.equals(this.y, other.y);
        }
        return false;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }

    @Override
    public String toString() {
        return "(" + x + ", " + y + ")";
    }
}

Java is often criticized for being too verbose, although some argue that its verbosity and explicitness are actually advantages. That is not the point of this post, so I won’t say much on the topic. However, I do think that for classes which are essentially just data containers with little or no behavior, the above example is quite tedious. It was even worse before Java 7, when there was no Objects class and equals() and hashCode() had to be written out field by field.
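One caveat from the comments in the plain Java example deserves a short illustration: `final` stops a field from being reassigned, but it does not stop the object it refers to from being mutated. A common remedy is a defensive copy. The sketch below uses a hypothetical Route class and List.copyOf(), which requires Java 10 or later.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical immutable class with an object (collection) field.
final class Route {
    private final List<String> stops;

    Route(List<String> stops) {
        // List.copyOf() returns an unmodifiable copy, so later changes to the
        // caller's list cannot leak into this object.
        this.stops = List.copyOf(stops);
    }

    // Safe to hand out directly because the stored list is unmodifiable.
    List<String> getStops() {
        return stops;
    }

    public static void main(String[] args) {
        List<String> source = new ArrayList<>();
        source.add("A");

        Route route = new Route(source);
        source.add("B"); // mutates the caller's list only

        System.out.println(route.getStops().size()); // prints "1"
    }
}
```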

The following examples will show how AutoValue, Immutables, and Project Lombok can be used to simplify the implementation of such objects in Java. Each example will try to approximate the plain Java example. One thing to note is that although each tool can generate a toString() method, the format will differ, so in each case I override the default implementation to match the above example.

This is not meant to be a comparison of these tools, so I will only cover the most basic use case. Each of these tools has a number of other features, and while there is a lot of overlap, they each do things a little differently.

… With AutoValue

AutoValue is an annotation processor that automatically generates implementations of immutable objects. Below is the AutoValue version of a Point implementation.

@AutoValue
public abstract class Point {
    static Point create(int x, int y) {
        return new AutoValue_Point(x, y);
    }

    public abstract int getX();
    public abstract int getY();

    @Override
    public String toString() {
        return "(" + getX() + ", " + getY() + ")";
    }
}

This will automatically generate a concrete subclass (named AutoValue_Point in this case) of the abstract Point class using annotation processing; the generated class will be similar to the plain Java example.

The create() factory method calls the constructor of the generated class, and callers create instances through that factory method.

Point p = Point.create(1, 2);
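To give a feel for the pattern without running an annotation processor, here is a hand-written approximation in plain Java. It is not the code AutoValue actually generates, and the names PointSpec and GeneratedStylePoint are mine. It only illustrates the shape: an abstract class that declares the properties, and a concrete final subclass that stores them and supplies equals() and hashCode().

```java
// Stand-in for the abstract class you would write yourself.
abstract class PointSpec {
    static PointSpec create(int x, int y) {
        return new GeneratedStylePoint(x, y);
    }

    abstract int getX();
    abstract int getY();

    @Override
    public String toString() {
        return "(" + getX() + ", " + getY() + ")";
    }
}

// Hand-written stand-in for the subclass a processor would generate.
final class GeneratedStylePoint extends PointSpec {
    private final int x;
    private final int y;

    GeneratedStylePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override int getX() { return x; }
    @Override int getY() { return y; }

    @Override
    public boolean equals(Object o) {
        if (o == this) return true;
        if (o instanceof PointSpec) {
            PointSpec that = (PointSpec) o;
            return x == that.getX() && y == that.getY();
        }
        return false;
    }

    @Override
    public int hashCode() {
        return 31 * x + y;
    }
}
```

Callers never name the subclass. They only see the abstract type and its factory method, which is what lets the tool regenerate the implementation freely.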

… With Immutables

Immutables is another annotation processor similar to AutoValue. Below is the Immutables version of a Point.

@Value.Immutable
public abstract class Point {
    @Value.Parameter
    public abstract int getX();
    @Value.Parameter
    public abstract int getY();

    @Override
    public String toString() {
        return "(" + getX() + ", " + getY() + ")";
    }
}

Like AutoValue, it automatically generates an immutable implementation (named ImmutablePoint in this case) and we are able to override the default implementation if we choose. Unlike AutoValue, it creates a builder by default (you must opt in to a builder implementation with AutoValue) and only generates the of() factory method if you use the @Value.Parameter annotation.

Point p1 = ImmutablePoint.of(1, 2);
Point p2 = ImmutablePoint.builder()
        .x(1)
        .y(2)
        .build();

… With Project Lombok

On the surface Project Lombok looks very similar to AutoValue and Immutables. However, behind the scenes, it is quite different — more on this after the example. Below is a Point implementation using Project Lombok.

@Value
public class Point {
    private final int x;
    private final int y;

    @Override
    public String toString() {
        return "(" + x + ", " + y + ")";
    }
}

In contrast to AutoValue and Immutables, which create subclasses, Project Lombok will enhance the Point class bytecode with the appropriate methods. This has the advantage of making its usage identical to that of the plain Java class.

Point p = new Point(1, 2);

However, this is not possible with standard Java annotation processing, so Lombok relies on internal compiler APIs. Because those APIs are not public, Lombok could technically break whenever the compiler internals change, and that is essentially what happened in Java 9: internal APIs were encapsulated, so Lombok cannot run without a number of compiler flags that explicitly allow access to them. Refer to this issue tracking Java 9 support for more information.

… In Kotlin

Kotlin is a relatively new language which originally targeted the JVM, but now supports compilation to JavaScript and native platforms. I like to think of Kotlin as what Java would become if the language designers were able to redesign Java from scratch using everything learned over the past 20+ years and without worrying about backwards compatibility. Obviously this is not really how Kotlin was designed, but along those lines, one thing they might add is the ability to more easily create data container classes without relying on annotation processing or compiler hacks. This is exactly what Kotlin does with its “data classes.”

// Classes in Kotlin are final by default and must be explicitly declared
// `open`. Data classes cannot be `open`.
// In Kotlin, you define a primary constructor in the class header.
data class Point(val x: Int, val y: Int) {
    // Like the Java annotation processors, Kotlin provides a `toString()`
    // implementation, but we override it.
    // (If we did not need this, the entire class definition would just be the
    // first line)
    override fun toString(): String {
        // Kotlin supports string interpolation.
        return "($x, $y)"
    }
}

Without looking at the generated bytecode (assuming you are targeting the JVM), which I have not done, it is hard to make exact comparisons to Java. However, this should be very similar to the original plain Java example, with a few other additional methods required for some other Kotlin features.

The generated class can be seamlessly used from either Java or Kotlin (and probably other JVM languages as well, but I don’t have any experience with that).

Point p = new Point(1, 2);

// Kotlin does not use the `new` keyword for constructors.
val p1: Point = Point(1, 2)
// The type of `p2` can be omitted since it can be inferred by the compiler.
val p2 = Point(1, 2)

… In Rust

Rust is another relatively new language. There are not many similarities to Java or Kotlin so I won’t try to make any comparisons. However, I think it is still very relevant to this discussion because of its very different approach to immutability. I am still very much a novice at Rust, so in order to avoid getting too much wrong, I will not be going into too much detail.

First, let’s start with a Rust Point implementation. The comments should briefly explain what is going on.

// Create a structure of related data.
// Rust does not automatically generate hash code and equality methods, but
// they can often be automatically derived.
#[derive(PartialEq, Eq, Hash)]
struct Point {
    x: i32,
    y: i32,
}

// Start an implementation block where we define methods on the struct.
impl Point {
    // Rust does not have constructors, but a common pattern is to provide a
    // `new` function to initialize struct instances.
    fn new(x: i32, y: i32) -> Point {
        // Field init shorthand: equivalent to `Point { x: x, y: y }`.
        Point { x, y }
    }

    // A `self` parameter indicates that it is a method that operates on an
    // instance of the struct. If it does not have a `self` parameter (as with
    // the `new` function above) then it is an "associated function" and is
    // approximately analogous to a static method in Java.
    fn get_x(&self) -> i32 {
        self.x
    }

    fn get_y(&self) -> i32 {
        self.y
    }

    // The `mut` keyword means that we can mutate the struct.
    // More on this later.
    fn set_x(&mut self, x: i32) {
        self.x = x
    }

    fn set_y(&mut self, y: i32) {
        self.y = y
    }
}

// This implements the std::fmt::Display "trait", which defines the "fmt"
// function. This is approximately equivalent to implementing toString() in
// Java/Kotlin.
impl std::fmt::Display for Point {
    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
        write!(f, "({}, {})", self.x, self.y)
    }
}

The first thing to note is that I’ve included methods to change the Point’s data. In most languages this would mean that the type is not immutable. However, Rust determines mutability per variable binding rather than per type. This means that we can have two values of the same type, one of which is mutable while the other is immutable.

// Variables are immutable by default.
let p1 = Point::new(1, 2);

// You must explicitly declare a variable as mutable using the `mut` keyword.
let mut p2 = Point::new(3, 4);

// Compile error because `p1` is not mutable.
p1.set_x(5);

// This is fine since `p2` is mutable.
p2.set_x(6);

This is only a small portion of the guarantees that Rust provides about mutability, but I don’t want this to turn into a Rust tutorial, so I won’t go into any more detail². The point of this section was simply to demonstrate that Rust treats mutability very differently than many other languages. It treats it as a fundamental part of the language and makes immutability the default rather than requiring the programmer to opt in to immutability.
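For contrast, here is a small Java sketch of how far `final` on a local variable goes. MutablePoint and FinalReferenceDemo are names I invented for illustration. The `final` keyword prevents the variable from being reassigned, but unlike an immutable Rust binding it does not prevent the object from being mutated through that variable.

```java
final class FinalReferenceDemo {
    // A deliberately mutable point, similar to the Rust struct with setters.
    static final class MutablePoint {
        int x;
        int y;

        MutablePoint(int x, int y) {
            this.x = x;
            this.y = y;
        }

        void setX(int x) {
            this.x = x;
        }
    }

    static int mutateThroughFinalReference() {
        final MutablePoint p = new MutablePoint(1, 2);

        // p = new MutablePoint(3, 4); // would not compile: p is final

        // This compiles: `final` protects the reference, not the object.
        p.setX(5);
        return p.x;
    }

    public static void main(String[] args) {
        System.out.println(mutateThroughFinalReference()); // prints "5"
    }
}
```

In Java, immutability therefore has to be built into the type itself, which is what the earlier sections were about.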

Conclusion

Immutability is a simple yet powerful tool allowing you to write simpler and more efficient code. We briefly explored three different languages and showed how some more modern languages such as Kotlin and Rust embrace and encourage immutability. Kotlin does this by providing features such as data classes to make it easier to create immutable classes. Rust does this by making mutability a fundamental part of the language and requiring you to explicitly declare that a value is mutable.

Although Java does not make it as easy to work with immutable data as Kotlin or Rust, there are a number of third-party tools that can help. Even if you were not convinced by my arguments, the fact that there are at least three such tools in popular usage should give an impression of the value of immutable data.

¹ This is based on the current OpenJDK implementation of HashSet and HashMap. The exact memory usage may differ in different or future implementations and I may have gotten some of the details wrong about the current implementation, but the fact remains that there is a drastic difference between a general set implementation and a specialized implementation that only needs to handle one specific case.
² For more information, check out the Rust book, particularly the chapter on ownership, which is a fundamental part of the language that determines how data can be accessed and mutated.