Learnitweb

Java Collection Framework – An Introduction

1. Introduction

The Java Collection Framework (JCF) is a unified architecture for storing and manipulating groups of data. It provides a set of interfaces and classes that support a wide variety of data structures including lists, sets, queues, and maps.

Collection Framework Hierarchy (Textual Diagram)

java.util.Collection (Interface)
|
|-- List (Interface)
|   |-- ArrayList (Class)
|   |-- LinkedList (Class)
|   |-- Vector (Class)
|       |-- Stack (Class)
|
|-- Set (Interface)
|   |-- HashSet (Class)
|   |-- LinkedHashSet (Class)
|   |-- TreeSet (Class)
|
|-- Queue (Interface)
    |-- PriorityQueue (Class)
    |-- ArrayDeque (Class)

java.util.Map (Interface)
|
|-- HashMap (Class)
|-- LinkedHashMap (Class)
|-- TreeMap (Class)
|-- Hashtable (Class)
    |-- Properties (Class)

2. Core Interfaces and Classes

2.1 Collection Interface

  • This is the root interface in the collection hierarchy and is extended by the List, Set, and Queue interfaces.
  • It defines the basic methods such as add(), remove(), size(), clear(), contains(), and iterator().
  • It cannot be instantiated directly, but it defines a common set of functionalities for all collection classes.

3. List Interface and Implementations

3.1 List

  • A List is an ordered collection that allows duplicate elements.
  • It provides positional access and insertion of elements.
  • Lists are ideal when the order of elements matters and you need to access elements by index.

3.1.1 ArrayList

  • Implements a resizable array, which grows as elements are added.
  • Provides constant-time performance for the get() and set() operations.
  • Slower than LinkedList when inserting or removing elements from the middle.
  • Best suited for scenarios where frequent read operations are needed with fewer insertions/deletions.

3.1.2 LinkedList

  • Uses a doubly-linked list data structure.
  • Allows constant-time insertions or removals using iterators.
  • Slower than ArrayList for accessing elements by index.
  • Suitable for applications that require frequent addition/removal of elements from the start or middle of the list.

3.1.3 Vector (Legacy)

  • Similar to ArrayList but synchronized, making it thread-safe.
  • Due to synchronization, it has a performance overhead and is mostly replaced by ArrayList or thread-safe alternatives.
  • Used in legacy applications where thread safety was a primary requirement.

3.1.4 Stack

  • A subclass of Vector that implements a LIFO (Last-In-First-Out) stack.
  • Supports operations like push(), pop(), and peek().
  • Useful for undo features, parsing expressions, and recursion-based operations.

4. Set Interface and Implementations

4.1 Set

  • A Set is a collection that does not allow duplicate elements.
  • It is ideal when you want to store unique elements.
  • Does not guarantee the order of elements unless specified.

4.1.1 HashSet

  • Implements a hash table for storage.
  • Does not maintain insertion order.
  • Provides constant-time performance for add(), remove(), and contains() methods.
  • Suitable for storing non-duplicate items with fast lookup times.

4.1.2 LinkedHashSet

  • Extends HashSet and maintains the order in which elements were inserted.
  • Uses a combination of hash table and linked list to maintain insertion order.
  • Ideal for scenarios where insertion order is important.

4.1.3 TreeSet

  • Implements the NavigableSet interface and uses a Red-Black Tree.
  • Maintains elements in natural order or according to a custom comparator.
  • Slower than HashSet due to sorting but provides useful navigation methods.
  • Used when sorted iteration is required.

5. Queue Interface and Implementations

5.1 Queue

  • A Queue is used to hold multiple elements prior to processing.
  • Typically follows FIFO (First-In-First-Out) order.
  • Includes special-purpose implementations like PriorityQueue and ArrayDeque.

5.1.1 PriorityQueue

  • Does not guarantee FIFO order.
  • Elements are ordered using natural ordering or a custom comparator.
  • Ideal for scheduling tasks where higher priority elements should be processed first.

5.1.2 ArrayDeque

  • Implements a resizable array for a double-ended queue (deque).
  • Supports insertion and removal at both ends with constant-time performance.
  • More efficient than Stack and LinkedList for stack/queue operations.

6. Map Interface and Implementations

6.1 Map

  • A Map is a collection that maps keys to values.
  • Each key must be unique, but values can be duplicated.
  • Provides methods like put(), get(), remove(), containsKey().

6.1.1 HashMap

  • Implements a hash table for key-value pairs.
  • Does not guarantee any order of keys.
  • Allows one null key and multiple null values.
  • Fast lookup, insertion, and deletion.
  • Most commonly used implementation for mapping data.

6.1.2 LinkedHashMap

  • Maintains insertion order of keys.
  • Useful for maintaining predictable iteration order.
  • Often used in applications like LRU (Least Recently Used) cache implementations.

6.1.3 TreeMap

  • Implements a Red-Black Tree and stores entries in sorted key order.
  • Slower than HashMap but provides sorted traversal.
  • Useful when consistent ordering is required.

6.1.4 Hashtable (Legacy)

  • Thread-safe map that synchronizes every method.
  • Allows neither null keys nor null values.
  • Has been replaced by ConcurrentHashMap in concurrent applications.

6.1.5 Properties

  • A subclass of Hashtable designed for use with configuration files.
  • Stores key-value pairs as strings.
  • Useful for reading .properties files.