Java Collections Interview Questions with Detailed Answers

None of the interviews for Java based position can be completed without questions about Java Collections Framework. The Java Collection Framework is one of the core framework of Java language which each Java developer should know. The Java Collection Framework itself is very wide and its difficult to cover all the areas but still in this blog we will try to cover those questions and answers in detail which a candidate should know.

Hierarchy of collection framework:

Java collections hierarchy

Q 1. What is the difference between an ArrayList and Vector ?

A. ArrayList and Vector both uses Array as underlying linear data structure internally. Still there are differences in the manner they internally store and process the stored data.

1) Synchronization: ArrayList is non-synchronized, which means its unsafe to work with ArrayList in a Multithreaded environment because all threads can concurrently access and modify the contents of the ArrayList. Vector is synchronized where multiple threads cannot access the contents of a Vector concurrently. In simple words if one thread is accessing a Vector object, no other thread can get access to it. Unlike ArrayList, only one thread can perform an operation on vector at a time.

2) Growth on overflow: In case of overflow ArrayList grows by half of its size and Vector grows by double of its size. In case of ArrayList if current size of an ArrayList is 10 then on insertion of 11th element its new size would be 15 and in case of Vector new size would be 20.

3) Performance: Since ArrayList is non-synchronized hence its performance will be better than Vector. On performing operations on ArrayList, Threads need not to acquire/release locks which makes if performance better than Vector.

4) Fail-fast and Fail-safe: To traverse over ArrayList an Iterator is needed which itself is fail-fast. Fail-fast means that if the underlying collection (ArrayList in this case) is structurally modified during the traversal the Iterator would fail and throw ConcurrentModificationException. Vectors can traversed using iterator and Enumerator both where traversal through Enumerator is fail-safe. If a Vector is traversed using Enumeration then it can be structurally modified.

Q 2. Why Collection interface does not extend Cloneable and Serializable interface ?

A. The JDK does not provide any direct implementations of Collection interface. The Collection interface is further extended by more specific interfaces like List and Set. So it is not necessary that every concrete implementation class should provide Cloneable and Serializable behavior. There may be cases where the Collection is very large and the application won't allow cloning of the Collections because cloning is a heavy operation. Hence its left upto the users if they need such data structure which should be Cloneable then they should pick proper Collection implementation like ArrayList and where they don't need Cloneable behavior there they can pick those implementation which do not support like ArrayBlockingQueue.

Q 3. What is the difference between a HashMap and ConcurrentHashMap ?

A. Both HashMap and ConcurrentHashMap are equivalent to HashTable implementation except their concurrency support. Essentially a HashMap is not thread safe and a ConcurrentHashMap is thread safe. Even though all operations of ConcurrentHashMap are thread-safe, retrieval operations do not entail locking, and there is not any support for locking the entire table in a way that prevents all access. No locking is at all supported by HashMap but in case of ConcurrentHashMap only that portion is locked which is being used by current Thread.

Q 4. What is a synchronized HashMap ?

A. By default the HashMap is non-synchronized and is not thread safe. We can make it thread safe by passing its reference to a static method of Collections class. That static method is synchronizedMap(Map). This method returns a (new instance of SynchronizedMap class which is a private static inner class of Collections class) synchronized (thread-safe) map which is backed by the specified map. In order to guarantee a thread-safe behavior on the passed HashMap, it is necessary that all access to the backing map is accomplished through the returned map. Though by calling this method the Map itself becomes thread safe and allows serial access to its reference, but it won't guarantee the thread-safe behavior for iteration over any of its collection views. Which means that the user should manually synchronize on the returned map using synchronized block. Consider this example:

Map threadSafeMap = Collections.synchronizedMap(new HashMap());
   ....
   ....
Set nonThreadSafekeySet = threadSafeMap.keySet();
   ....
   ....
synchronized (threadSafeMap) {
  Iterator iter = nonThreadSafekeySet.iterator(); // Must be in a synchronized block
  while (iter.hasNext())
    processIt(iter.next());
  }

Here the map becomes thread safe but its collection view (keySet()) is still not thread-safe so we should put the iteration code in a synchronized block.

Q 5. What is the difference between ConcurrentHashMap and a HashMap which is made synchronized by Collections.synchronizedMap(Map)

A. 1. Concurrency: ConcurrentHashMap and Collections.synchronizedMap() both provide thread-safe operations on their data. One major difference in their operation is that the lock is acquired on whole object of HashMap but in case of ConcurrentHashMap the lock is acquired only on that portion of Map which is currently being accessed by Thread. Which essentially means that other parts of ConcurrentHashMap can still be used by other running threads.

2. Performance: Collections.synchronizedMap will not perform better than ConcurrentHashMap because if former case the lock is acquired on the whole object so all other threads should have to wait to acquire the lock until the current thread completes. In case of ConcurrentHashMap multiple threads can access multiple portions of its object so execution will be faster.

3. Null-keys: ConcurrentHashMap won't allow null keys and values but HashMap allows 1 null key and multiple null values.

4. Concurrent Modification: ConcurrentHashMap gurantees that it won't throw ConcurrentModificationException while one thread is updating the map and other thread is traversing the iterator obtained from the map. However, Collections.synchronizedMap() will not guaranteed on this. If we obtain an Iterator from Collections.synchronizedMap() by calling map.keySet().iterator() and then traverse the iterator, at the same time if another thread is trying to updating the map by calling map.put(K,V), we will get a ConcurrentModificationException. Consider following example:

//It will throw ConcurrentModificationException
Map<String, String> map = Collections.synchronizedMap(new TreeMap<String, String>());
//It won't throw ConcurrentModificationException
//Map<String, String> map = new ConcurrentHashMap<String, String>();

map.put("k1","v1");
map.put("k2","v2");
map.put("k3","v3");
map.put("k4","v4");
map.put("k5","v5");

Set<Map.Entry<String,String>> entries = map.entrySet();
Iterator<Map.Entry<String,String>> iter = entries.iterator();

while(iter.hasNext()){
      System.out.println(iter.next()); 
      map.remove("k2");//Will throw ConcurrentModificationException in case of Collections.synchronizedMap
    }
}

5. Insertion Order: ConcurrentHashMap will not preserve the insertion order of objects but Collections.synchronizedMap() will maintain the order of the passed collection.

Q 6. What is the difference between HashTable and Collections.synchronizedMap(Map)
A. They both are same in terms of concurrent access and performance. Both will lock the entire Collection if one thread gains access to them. The only major difference is that Hashtable have synchronized methods, whereas synchronized HashMap use synchronized blocks inside the methods. One more difference can be that HashTable is a legacy type but Collections.synchronizedMap can make any Map reference a thread safe Map.

Q 7. What are the differences between a Synchronized Collection and a Concurrent Collection ?
A. As their name suggest both of the Collection are able to deal with multiple threads at the same time and provide Thread safety. The main difference between these two collection types is performance in terms of how they achieve Thread safety and scalability.
A collection can be made synchronized by using appropriate static method from Collections.java class like Collections.synchronizedList(), Collections.synchronizedSet(), Collections.synchronizedMap(). The collections who can be made synchronized like ArrayList, HashMap and HashSet are slower than their concurrent versions CopyOnWriteArrayList, ConcurrentHashMap and CopyOnWriteHashSet. Reason of slowness is because synchronized collections lock the whole object when a thread is accessing any portion of that collection. But in case of Concurrent Collections, they are smart enough and, they do not lock the whole object but locks only that portion of object which is currently being accessed by a Thread. Other parts of the concurrent collection are still available for use for other threads.

For example in ConcurrentHashMap the basic strategy is to subdivide the table among Segments, each of which itself is a concurrently readable hash table. Due to this multiple threads can access multiple segments individually at the same time which will increase the performance and reduced wait time.

Another example is CopyOnWriteArrayList which is a thread-safe version of ArrayList. The name CopyOnWrite itself says, make a copy of the underlying list each time when a write operation is done like add or remove an element. In a CopyOnWriteArrayList all the operations which can change the structure/state of the backed ArrayList like add, set, remove are implemented by making a fresh copy of the backed ArrayList.

When CopyOnWriteArrayList is iterated using Iterator interface, the iterator takes a snapshot of the underlying list and does not reflect any changes (made from concurrently accessing threads) to the underlying list after this snapshot was created. This is the reason that the iterator of this list will never throw a ConcurrentModificationException.
So CopyOnWriteArrayList is a special data structure which should be used in those cases for where multiple threads can access the list with more reads and less writes.

Q 8. What is difference between fail-fast and fail-safe Iterators ?
A. Fail-Fast Iterator: As the name suggests the fail-fast iterators are those iterators will immediately fail the iteration if the underlying collection is modified (by adding or removing elements from the collection from other threads rather than Iterator's own methods) during the iteration. Fail-fast iterators throw ConcurrentModificationException if collection is modified during iteration. Example of fail-fast iterators are the iterators retrieved from ArrayList, HashMap, HashSet etc.

Iterators maintains a flag named "mods". Iterator checks the value of "mods" whenever it gets the next value using hasNext() method and next() method. The value of this flag also changes whenever there is an structural modification, so on next iteration the Iterator would know that the flag "mods" is not in sync. Hence iterator knows when to throw ConcurrentModificationException.

Fail-Safe Iterator: Fail-safe iterators are those iterators who won't fail if the underlying collection is modified during the iteration. Internally a Fail Safe Iterator makes a copy of underlying data structure and the iteration is done over the copy of the object. Any structural modifications to underlying data structure affects the copy of the data structure. This way the original data structure remains structurally unchanged. This is the reason why fail-safe iterators won't throw ConcurrentModificationException. Apart from the concurrent modification feature there are few issues with these iterators:

a. It is not guaranteed that the data being read in the iteration is actually present in the original collection.

b. Additional overhead to maintain the copy.

According to Oracle docs, fail safe iterator is too costly, but may be more efficient than alternatives when traversal operations vastly outnumber mutations, and is useful when you cannot or don’t want to synchronize traversals, yet need to preclude interference among concurrent threads. The "snapshot" style iterator method uses a reference to the state of the array at the point that the iterator was created. This array never changes during the lifetime of the iterator, so interference is impossible and the iterator is guaranteed not to throw ConcurrentModificationException. The iterator will not reflect additions, removals, or changes to the list since the iterator was created. Element-changing operations on iterators themselves (remove(), set(), and add()) are not supported. Invocation of these methods will throw UnsupportedOperationException.

Q 9. Explain the implementation of HashSet in Java. How does it uses Hashing functions?
A. We all know that all the Set implementation classes allows only unique elements inside them. If you check the source code of HashSet.java class in Java source code then you can see the following code snipped for add()
private transient HashMap map; // Dummy value to associate with an Object in the backing Map private static final Object PRESENT = new Object(); ...... ... public boolean add(E e) { return map.put(e, PRESENT)==null; }

If you observe the implementation of add() above you can see that it keeps the incoming element in a map which is actually transient instance of HashMap. So internally HashSet is backed with a HashMap and the values of HashSet are stored as keys of HashMap. Since HashMap keys are unique hence it also assures that HashSet elements are also unique. The PRESENT is an instance of Object class to satisfy the HashMap's put method.

So, when we add an element in HashSet by calling add("abcd") method of HashSet then internally HashSet will put that element E ("abcd") as a key in the HashMap (which is actually initialized during HashSet object creation) and a Object class's object is passed as a value to this key.

Now other question is about using Hashing functions. As per the above answer it is now sure that HashSet do not control the state of the object but it is controlled by a HashMap. So as soon as we add any element in the HashSet, the put() method from HashMap is called to store the element. Now its HashMap's responsibility to assure that the passed Element can be stored as its Key and it remains a unique element in the entire life span of HashSet.

Whenever we try to store an element in the HashSet, internally the HashMap calls the hashCode() and equals() of the passed element and makes sure that the passed element is unique and no other element (key) exists in the HashMap with same hashCode() and equals().

Q 10. What should you do to use a custom object in Hash based data structure classes like Map or Set?
A. If we want to store custom object in Hash based data structures then we need to make sure we give a proper implementation for these methods in the Custom object:
1) equals(Object o) : The equals() is from Object class and available in each and every class of Java. In case of Hash based data structures we need to override it in such a way that if the passed object and the object on which equals() is called posses same data inside. For example lets consider following Employee class:
public class Employee { private String name; private Integer age; private String hobby; }

Then if we create two instances of Employee object with same name and age then the equals method should return true. So we should write equals() this way
public boolean equals(Object o) { if (o == null) { return false; } if (o == this) { return true; } if (!(o instanceof Employee)) { return false; } Employee emp = (Employee) o; if (this.name.equals(emp.getName()) && this.age == emp.getAge()) { return true; } return false; }

2) hashCode() : The hashCode() method returns a integer. The hashCode() should be properly implemented else improper and incorrect implementation of hashCode() would lead to hashcode collisions and hamper the performance of the data structure. The contract of hashCode() is if two objects are equal by equals() method then their hashCodes should also be same. From java docs
Objects that are equal must have the same hash code within a running process

With above definition of hashcode and equals consider the following statements :

Unequal objects must have different hash codes – NO
Objects with the same hash code must be equal – NO

This is the reason hashCode method should be implemented using prime numbers. And also we should make sure that we include all those properties in hashCode calculation which were used in equals() calculation (name and age in above example). Consider the following code to generate hashCode:
public int hashCode() { final int prime = 31; final int name = name == null ? 0 : name.hashCode(); final int age = age == 0 ? 0 : age.hashCode(); return prime * (prime + name + age); }

All of these hash based datastructures first calls the hashCode() to locate the bucket and then calls the equals() on all the objects found in that bucket. The ideal scenario is when all the objects in a bucket are "equal". When multiple "equal" objects are found in the same bucket then the first object is returned. This is the reason that for "equal" objects the "hashCode" should also be same so that all "equal" objects go under same bucket and searching of object is faster.

Q 11. Where ConcurrentHashMap can be used in Java ?
A. ConcurrentHashMap can be safely used in multithreaded environment because it allows multiple access to the elements of the underlying Map. It performs well when multiple threads accesses its elements. Also if it is shared between threads and if a thread is iterating over it and other thread updates its state then the iterator won't fail and the thread can complete the iteration without knowledge of the state changes. This behavior of Iterators is called fail-safe behavior.

Q 12. What are the ways to Sort objects in a Collection ?
If the Collection contains elements of primitive wrappers like Integer, Long and exceptionally String then the collection can be sorted using Collections.sort(). Using sort() from Collections class the underlying collection would be sorted according to the natural sorting order in ascending format. But if the underlying collection contains Custom Objects like Employee object then there are 2 ways to sort this collection:

1) Using Comparator Interface
2. Using Comparable Interface

Q 13. What is difference between HashMap and HashSet?

Differences

HashSet	HashMap
HashSet is the implementation of Set interface	HashMap is the implementation of Map interface
HashSet stores objects/elements directly without duplication of elements. Its representation can be as {"hello", "how", "are", "you"}. If you try to store a duplicate element then it will be ignored by HashSet.	HashMap stores elements in Key-Value pairs. Assuming a HashMap with Integer as key and String as its value then its representation can be as {1 ->"Hello", 2-> "how", 3->"are", 4-> "you"}
HashSet does not allow duplicate elements that means you can not store duplicate values in HashSet.	HashMap does not allow duplicate keys however it allows to have duplicate values.
HashSet permits a single null value.	HashMap permits single null key and any number of null values.

Similarities

Feature	HashSet	HashMap
Synchronization	No.	No.
Can be made synchronized.	Yes. Set s = Collections.synchronizedSet(new HashSet(...));	Yes. Map m = Collections.synchronizedMap(new HashMap(...));
Order of elements	Insertion order not guranteed.	Insertion order not guranteed.
Time complexity	Average Case O(1) when hashCode() is properly implemented. Worst case O(n).	Average Case O(1) when hashCode() is properly implemented. Worst case O(n).

Q 14. Can we replace Hashtable with ConcurrentHashMap ?
No. ConcurrentHashMap was the replacement of HashTable and Collections.synchronizedMap() which was added in concurrent package in java 1.5. So we cannot replace the usage of ConcurrentHashMap with a HashTable.

Comments

Technologypartner.inApril 23, 2015 at 11:23 AM
Thank you for this great blog post, Java is an object-oriented programming language. The Java platform differs from most other platforms in that it's a software-only platform that runs on top of other hardware-based platforms. Here you can find online java tutorial for beginners with example and online Java Test.
BabladMay 15, 2015 at 6:59 PM
Nice collection of questions....

Search This Blog

Java Code Geek

MSSQL server audit