Java 8 – List to Map – Concepts, Gotchas and Best practices

Introduction

We looked at basic collection-based transformations with Java 8, and their benefits, in one of the earlier blog posts in this series.

The intent of this post is to continue that journey and look at another very common and useful transformation pattern when working with collections: transforming a List (or a Set) into a Map, and comparing it with the respective pre-Java 8 implementation. The post also tries to simplify the process of understanding Java 8 constructs, and thereby help you adopt better coding practices and write simple, error-free and maintainable code.

If you are here for quick sample code, the variants covered in this post show various ways to transform a List (or a Collection) into a Map using Java 8 constructs. For an in-depth understanding of each one of them, their pros and cons, and which one to use when, please continue reading through the complete post.
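As a quick reference, the variants discussed in this post can be sketched roughly as follows. The Employee class, the QuickSamples class and all method names are hypothetical stand-ins (the post's own Employee type is not shown), and the employee id is assumed to be an int.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class QuickSamples {
    // Hypothetical stand-in for the post's Employee type (DOB omitted for brevity).
    static final class Employee {
        private final int id;
        private final String name;

        Employee(int id, String name) {
            this.id = id;
            this.name = name;
        }

        int getId() { return id; }
        String getName() { return name; }
    }

    // Variant 1: keys are expected to be unique; a duplicate name throws IllegalStateException.
    static Map<String, Employee> byName(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.toMap(Employee::getName, Function.identity()));
    }

    // Variant 2: several employees may share a name; keep all of them in a List per name.
    static Map<String, List<Employee>> groupedByName(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.groupingBy(Employee::getName));
    }

    // Variant 3: duplicate ids are resolved by keeping the element seen later.
    static Map<Integer, Employee> byIdKeepLast(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.toMap(Employee::getId, Function.identity(),
                        (first, second) -> second));
    }

    // Variant 4: as variant 3, but collecting into a caller-chosen Map implementation.
    static Map<Integer, Employee> byIdOrdered(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.toMap(Employee::getId, Function.identity(),
                        (first, second) -> second, LinkedHashMap::new));
    }
}
```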

Let's dig deeper and simplify our code

Here is an example of the most commonly used way of converting a list to a map using pre-Java 8 code. The below code transforms a list of Employee objects into a Map of Employee objects, with key as the name of the employee and value as the employee object itself.
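A minimal sketch of such a pre-Java 8 loop might look like this. The Employee class and the PreJava8ListToMap class are hypothetical stand-ins, since the post's own types are not shown.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PreJava8ListToMap {
    // Hypothetical stand-in for the post's Employee type (DOB omitted for brevity).
    static final class Employee {
        private final int id;
        private final String name;

        Employee(int id, String name) {
            this.id = id;
            this.name = name;
        }

        int getId() { return id; }
        String getName() { return name; }
    }

    static Map<String, Employee> toMapByName(List<Employee> employees) {
        Map<String, Employee> result = new HashMap<>();
        for (Employee employee : employees) {
            // put() silently replaces any earlier entry stored under the same name.
            result.put(employee.getName(), employee);
        }
        return result;
    }
}
```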

Variant 1 – the simplest toMap

And here is its equivalent in Java 8, using Java 8 constructs like Streams and Collectors.
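A sketch of the Java 8 version, using the two-argument Collectors.toMap, could look like the following. Again, the Employee class and the class and method names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class SimpleToMap {
    // Hypothetical stand-in for the post's Employee type (DOB omitted for brevity).
    static final class Employee {
        private final int id;
        private final String name;

        Employee(int id, String name) {
            this.id = id;
            this.name = name;
        }

        int getId() { return id; }
        String getName() { return name; }
    }

    static Map<String, Employee> toMapByName(List<Employee> employees) {
        return employees.stream()
                // key: the employee name; value: the employee itself
                .collect(Collectors.toMap(Employee::getName, Function.identity()));
    }
}
```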

In terms of understanding what is going on here, the above code uses Java 8 streams and Lambda support to

  • (internally) iterate over the list
  • use a predefined map Collector implemented in the Collectors helper class to collect the results in a map with the given key (employee name in this case) and the element itself as value.

An earlier blog post in this series has a detailed explanation of streams, collectors and the benefits of this approach. But generally, it is easy to see that the Java 8 version is superior, with benefits like internal iteration, immutability, conciseness, clarity, and a declarative style. However, there are many more benefits. Let's try to understand those as well.

Failing fast is better!!

Let's test our code with some test scenarios. It certainly works great for the simple test case where all employees have unique names. Instead, let us test it with a case where we have four Employee objects in the List, two of them having the same name.

Employee (Id=1, Name=”Robert”, DOB=01-March-1970)

Employee (Id=2, Name=”John”, DOB=10-April-1980)

Employee (Id=3, Name=”Robert”, DOB=10-April-1985)

Employee (Id=4, Name=”Michael”, DOB=10-April-1972)

Our pre-Java 8 implementation above returns a Map that contains three entries for this test case, while the Java 8 version above results in an exception. Obviously, not something we were expecting.

Let's analyse: we would expect the resultant map to contain four entries corresponding to our four employees in the source list. However, a careful look reveals that our pre-Java 8 code does not account for the case where employees have the same name, so it inadvertently (and silently) overwrites the map entry for “Robert”. Thus we are missing the entry for the employee with id=1.

The Java 8 implementation we are using accounts for this condition and throws an exception instead of silently ignoring the problem. So, the Java 8 implementation lets the code fail fast instead of silently hiding away a hard to find bug that is data dependent.
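The fail-fast behavior can be reproduced with a small sketch like the one below, using the four sample employees from above. The class and method names are hypothetical; the two-argument toMap is documented to throw IllegalStateException on duplicate keys, but the exact message text differs between JDK versions, so the sketch does not rely on it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class FailFastDemo {
    // Hypothetical stand-in for the post's Employee type (DOB omitted for brevity).
    static final class Employee {
        private final int id;
        private final String name;

        Employee(int id, String name) {
            this.id = id;
            this.name = name;
        }

        int getId() { return id; }
        String getName() { return name; }
    }

    // The four employees from the test scenario; two of them are named "Robert".
    static List<Employee> sampleEmployees() {
        return Arrays.asList(
                new Employee(1, "Robert"),
                new Employee(2, "John"),
                new Employee(3, "Robert"),
                new Employee(4, "Michael"));
    }

    static Map<String, Employee> toMapByName(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.toMap(Employee::getName, Function.identity()));
    }

    public static void main(String[] args) {
        try {
            toMapByName(sampleEmployees());
        } catch (IllegalStateException e) {
            // Thrown because "Robert" occurs twice as a key.
            System.out.println("Duplicate key rejected: " + e.getMessage());
        }
    }
}
```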

Key difference in fail-fast behavior

A point to note is that a good unit test that tests for this condition will make the above pre-Java 8 code snippet fail the test as well, but the unit test has to assert for that condition. If there aren’t any unit tests or the unit tests aren’t comprehensive enough to assert for all conditions, then the pre-Java 8 code snippet above would silently keep working incorrectly in production. However, the Java 8 code snippet would fail with an exception in such a situation.

Let's fix the above problem with our next variant.

Variant 2 – groupingBy – group on a key

Alright, so how do we fix this problem? In this particular example, since our original employees list contains only unique Employee objects (distinct employees with the same name), the resultant map needs to contain all those employees, so the structure of the resultant map has to change. Since multiple employees can have the same name, we need to collect Employee objects with the same name in a collection, maybe a list. Thus, our resultant structure should be Map<String, List<Employee>> instead of Map<String, Employee>.

So here goes the pre-Java 8 implementation…
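One way such a pre-Java 8 grouping loop might be written is sketched below. The Employee class and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PreJava8Grouping {
    // Hypothetical stand-in for the post's Employee type (DOB omitted for brevity).
    static final class Employee {
        private final int id;
        private final String name;

        Employee(int id, String name) {
            this.id = id;
            this.name = name;
        }

        int getId() { return id; }
        String getName() { return name; }
    }

    static Map<String, List<Employee>> groupByName(List<Employee> employees) {
        Map<String, List<Employee>> result = new HashMap<>();
        for (Employee employee : employees) {
            List<Employee> sameName = result.get(employee.getName());
            if (sameName == null) {
                // First employee with this name: allocate the list and register it.
                sameName = new ArrayList<>();
                result.put(employee.getName(), sameName);
            }
            sameName.add(employee);
        }
        return result;
    }
}
```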

 

The above code certainly looks a bit involved and complex, and hence much more error prone. For example, one has to instantiate a new map to store the result and, while iterating through each employee, check whether an employee with the same name already exists in the result map. If it does not, one has to allocate a new List to store the employees with this name and store that list against the employee name in the result map. Likewise, if the result map already contains an entry for the given employee name, the current employee has to be added to the corresponding List of employees.

So, there are multiple points where a bug can be introduced inadvertently through programming or logic errors. Let's look at the Java 8 implementation of the same thing using streams and Collectors.
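A sketch of the grouping version with Collectors.groupingBy follows. The Employee class and the class and method names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingByName {
    // Hypothetical stand-in for the post's Employee type (DOB omitted for brevity).
    static final class Employee {
        private final int id;
        private final String name;

        Employee(int id, String name) {
            this.id = id;
            this.name = name;
        }

        int getId() { return id; }
        String getName() { return name; }
    }

    static Map<String, List<Employee>> groupByName(List<Employee> employees) {
        return employees.stream()
                // Employees sharing a name end up in the same List.
                .collect(Collectors.groupingBy(Employee::getName));
    }
}
```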

Again, the code looks very similar to the earlier Java 8 code snippet: we are using streams to iterate the list and collect the result with a Collector (through the Collectors utility class). The only difference is the collector being used. In this case, we are using a collector that groups the results on a given key instead of a simple map collector, which is very logical because we are grouping employees by name. As is evident, the Java 8 streams API not only makes the code concise and readable, it also reduces the potential for developers to make logic errors while performing mundane tasks like transforming a collection into another collection.
It obviously cannot get easier than this.

Less code to write, fewer bugs, fewer unit tests…!!

One of the simplest ways to reduce bugs in the code is to not write code :), or to write as little as possible!

In this particular scenario, it is clear that there is hardly any chance of introducing a bug using the Java 8 code snippet above, as compared to pre-Java 8 version, which had a lot of potential for programmers to introduce bugs.

Another point to note in this particular example is that it eliminates the need to write a unit test for this specific condition. The pre-Java 8 code snippet above requires you to write a unit test to check that all employees in the original list are accounted for, while with the Java 8 code snippet above you get that functionality for free from the JDK, so you do not need to write a specific unit test for it.

Variant 3 – toMap – manage merge-conflict-resolution

So, we looked at the case where we would like to group our objects because they have common keys, although the objects themselves were unique. However, imagine that we have genuine duplicates in the original list. For example, the original List contains duplicate Employee objects (employee objects with the same Id), and we would like to transform that List<Employee> into a Map<String, Employee> where the map key is the employee id.

Obviously, as explained above, the simple vanilla version of Java 8 Collectors.toMap would result in an exception because of the duplicate key. So, the Collectors framework provides another variant of toMap which lets us resolve the duplication problem (or merge conflict). Here is an example.
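A minimal sketch of the three-argument toMap with a merge function that keeps the later of two duplicates might look like this. The Employee class and the class and method names are hypothetical, and the id is modelled as an int here, so the map is keyed by Integer rather than String.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class ToMapWithMerge {
    // Hypothetical stand-in for the post's Employee type (DOB omitted for brevity).
    static final class Employee {
        private final int id;
        private final String name;

        Employee(int id, String name) {
            this.id = id;
            this.name = name;
        }

        int getId() { return id; }
        String getName() { return name; }
    }

    static Map<Integer, Employee> toMapById(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.toMap(
                        Employee::getId,
                        Function.identity(),
                        // Merge conflict: two employees share an id, keep the later one.
                        (first, second) -> second));
    }
}
```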

The above code snippet can be imagined as someone iterating over the input collection and adding one element at a time to the map, checking whether a value already exists for the key. If it does exist in the map (a duplicate), the two duplicate employee instances are passed to the merge function, which resolves the merge conflict, in this case by picking the second one (and getting rid of the earlier one).

The above example illustrates the use of another variant of toMap which lets us resolve the merge conflict by choosing the right object instance from the two duplicate instances being compared. So in this particular example, if the input collection is ordered and sorted by update timestamp, the employee object instance that comes later in the iteration (having the latest update timestamp) would finally make it to the map at the end of the transformation, and all previously occurring duplicates would get ignored. Obviously, the logic to resolve the merge conflict can be as simple as illustrated above, or it can be complex, in which case it can be written as another lambda expression or, better, as a function call as illustrated in the below code snippet.
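As a sketch of the function-call approach, the merge logic can live in a named method that is passed as a method reference. The updatedAt field, the mostRecentlyUpdated method and the class names are all assumptions made for illustration; the post does not show how its Employee type stores the update timestamp.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class MergeWithMethodReference {
    // Hypothetical Employee type with an assumed update timestamp (epoch millis).
    static final class Employee {
        private final int id;
        private final String name;
        private final long updatedAt;

        Employee(int id, String name, long updatedAt) {
            this.id = id;
            this.name = name;
            this.updatedAt = updatedAt;
        }

        int getId() { return id; }
        String getName() { return name; }
        long getUpdatedAt() { return updatedAt; }
    }

    // Merge logic as a named method: prefer the more recently updated instance,
    // regardless of the order in which the duplicates appear in the list.
    static Employee mostRecentlyUpdated(Employee first, Employee second) {
        return second.getUpdatedAt() >= first.getUpdatedAt() ? second : first;
    }

    static Map<Integer, Employee> toMapById(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.toMap(
                        Employee::getId,
                        Function.identity(),
                        MergeWithMethodReference::mostRecentlyUpdated));
    }
}
```

Unlike the inline lambda, this version does not depend on the input being sorted by update timestamp.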

Other Variants

We looked at three variants above, which depicted significantly different functional behavior. There are many more which add some more flavors to these. Let me briefly cover those as well.

Variant 4 – toMap with a supplier
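A sketch of the four-argument toMap, which takes a map supplier as its last argument, is shown here with a TreeMap so that the keys come out sorted. The Employee class and the class and method names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

class ToMapWithSupplier {
    // Hypothetical stand-in for the post's Employee type (DOB omitted for brevity).
    static final class Employee {
        private final int id;
        private final String name;

        Employee(int id, String name) {
            this.id = id;
            this.name = name;
        }

        int getId() { return id; }
        String getName() { return name; }
    }

    static Map<String, Employee> toSortedMapByName(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.toMap(
                        Employee::getName,
                        Function.identity(),
                        (first, second) -> second, // keep the later duplicate
                        TreeMap::new));            // the collector fills this map
    }
}
```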

As shown above, you could provide your own custom Map instance that will be used by the collector to collect elements. This could be useful for specific use cases, like maintaining a sorted map or an ordered map, for instance by using a TreeMap or LinkedHashMap or any other custom Map implementation.

More Variants – toConcurrentMap – Performance with parallel streams

All three variants of toMap covered earlier work with parallel streams as well, but there are corresponding toConcurrentMap variants for better performance with parallel streams, because the toMap variants perform a combine operation to merge keys from one map to another in case of parallel streams.
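For illustration, a parallel stream collected with toConcurrentMap could be sketched as follows. The Employee class and the class and method names are hypothetical, and the ids are assumed to be unique, since the two-argument toConcurrentMap also throws IllegalStateException on duplicate keys.

```java
import java.util.List;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;
import java.util.stream.Collectors;

class ToConcurrentMapExample {
    // Hypothetical stand-in for the post's Employee type (DOB omitted for brevity).
    static final class Employee {
        private final int id;
        private final String name;

        Employee(int id, String name) {
            this.id = id;
            this.name = name;
        }

        int getId() { return id; }
        String getName() { return name; }
    }

    static ConcurrentMap<Integer, Employee> toConcurrentMapById(List<Employee> employees) {
        return employees.parallelStream()
                // Worker threads insert into one shared ConcurrentMap
                // instead of building per-thread maps that are merged afterwards.
                .collect(Collectors.toConcurrentMap(Employee::getId, Function.identity()));
    }
}
```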

How does it all work!!

Alright, we have seen the various readily available map collector implementations in the Collectors helper class in the JDK. However, one question that arises is how it all works. Let's try to understand that at a high level. We will use the signature of one of the toMap utility methods within the Collectors helper class to understand this, as shown below.
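The four-argument toMap signature, as declared in the Java 8 JDK, is reproduced in the comment of the sketch below (later JDK versions name the last parameter differently). The method underneath passes each argument as an explicitly typed variable to make the functional interfaces visible. The Employee class and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collectors;

class ToMapSignature {
    /*
     * From java.util.stream.Collectors (Java 8):
     *
     * public static <T, K, U, M extends Map<K, U>>
     * Collector<T, ?, M> toMap(Function<? super T, ? extends K> keyMapper,
     *                          Function<? super T, ? extends U> valueMapper,
     *                          BinaryOperator<U> mergeFunction,
     *                          Supplier<M> mapSupplier)
     */

    // Hypothetical stand-in for the post's Employee type (DOB omitted for brevity).
    static final class Employee {
        private final int id;
        private final String name;

        Employee(int id, String name) {
            this.id = id;
            this.name = name;
        }

        int getId() { return id; }
        String getName() { return name; }
    }

    static Map<String, Employee> toMapByName(List<Employee> employees) {
        Function<Employee, String> keyMapper = Employee::getName;           // element -> key
        Function<Employee, Employee> valueMapper = Function.identity();     // element -> value
        BinaryOperator<Employee> mergeFunction = (first, second) -> second; // resolve duplicates
        Supplier<Map<String, Employee>> mapSupplier = HashMap::new;         // result map factory

        return employees.stream()
                .collect(Collectors.toMap(keyMapper, valueMapper, mergeFunction, mapSupplier));
    }
}
```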

As you may note, the method takes four parameters.

  • keyMapper – a lambda function to resolve an input element into a key
  • valueMapper – a lambda function to resolve an input element into a value
  • mergeFunction – a lambda function to encapsulate the merge conflict
  • mapSupplier – a lambda function to provide the resultant map instance

The key and value mappers are lambda functions that take a single input (element in the collection) and return a single output (the key or the value). Thus, their declaration uses the Function<T, R> functional interface where T is the input type and R is the output type.

Functionally, we want the merge conflict logic to take two elements of same type and return an element of the same type after resolving the merge conflict. Thus, the mergeFunction lambda function should take two input arguments and return an output, all of the same type. That is defined by BinaryOperator<T> functional interface, hence it is represented as the third argument to our toMap method above.

Likewise, the fourth argument is represented as another functional interface, Supplier<T>, that lets you pass a lambda function that can encapsulate the logic to create an instance of a Map.

Final Thoughts

As illustrated through the various examples above, the Collectors helper class introduced in Java 8 provides a number of useful built-in utility collectors that can easily transform a Collection into a Map with very little code. This approach to transformation not only has the various benefits of using Java 8 streams, like internal iteration, immutability, and clear and concise code, but it also helps programmers write less error-prone code that is much easier to maintain and run in production.

Although the syntax may seem a bit complex initially, once familiar and understood, it helps us write much better code, abstracting the mundane transformation logic away from the programmer, and in the process making the code clean, maintainable and bug free.

Hope you enjoyed reading the post and learnt something new to try!!!

