HashSet - Notes By ShariqSP
HashSet in Java
HashSet
is an implementation of the Set
interface in Java that uses a hash table to store elements. It provides constant-time performance for basic operations such as add, remove, and contains, assuming a good hash function is used.
Definition:
A HashSet
is a collection that does not allow duplicate elements. It uses hashing to store elements and provides constant-time performance for basic operations such as add, remove, and contains. Elements are stored in a hash table, which ensures fast retrieval and efficient storage.
Methods:
- add(E element): Adds the specified element to the set if it is not already present.
- remove(Object obj): Removes the specified element from the set if it is present.
- contains(Object obj): Returns true if the set contains the specified element.
- size(): Returns the number of elements in the set.
- isEmpty(): Returns true if the set contains no elements.
- clear(): Removes all elements from the set.
Example:
import java.util.HashSet;
public class Main {
public static void main(String[] args) {
// Create a HashSet of strings
HashSet set = new HashSet<>();
// Add elements to the set
set.add("Java");
set.add("Python");
set.add("C++");
// Display the elements
System.out.println("Elements in the set:");
System.out.println(set);
// Remove an element
set.remove("Python");
// Display the modified set
System.out.println("\nElements after removal:");
System.out.println(set);
}
}
In this example, a HashSet
is created to store strings. Elements are added to the set using the add()
method and removed using the remove()
method.
Hashing in HashSet
HashSet is an implementation of the Set interface in Java that uses hashing to store and retrieve elements efficiently. Hashing is a technique that converts an object into an integer value called a hash code. This hash code is then used to determine the index or bucket where the object should be stored in the HashSet's underlying array.
Hash Function
The process of generating a hash code from an object is performed by a hash function. In Java, every object inherits a hashCode()
method from the Object
class, which returns an integer hash code for the object. The default implementation of hashCode()
method typically computes the hash code based on the object's memory address.
Bucketization
HashSet typically uses an array of buckets to store elements. Each bucket is associated with a hash code range, and elements with similar hash codes are stored in the same bucket. When an element is added to the HashSet, its hash code is used to determine the bucket where it should be stored. If there are multiple elements with the same hash code (a collision), they are usually stored in a linked list or a balanced tree within the bucket.
Load Factor and Rehashing
Similar to HashMap, HashSet also has a load factor that determines when the underlying array needs to be resized. When the number of elements in the HashSet exceeds a certain threshold based on the load factor, the HashSet is resized, and all elements are rehashed to new buckets. This process is known as rehashing and helps maintain efficient performance as the size of the HashSet grows.
Overall, hashing in HashSet allows for fast insertion, deletion, and lookup operations, making it a popular choice for scenarios where uniqueness of elements and efficient set operations are required.
Advantages and Disadvantages of HashSet
Advantages:
- Fast Lookup: HashSet provides constant-time performance for basic operations such as adding, removing, and checking for the presence of elements. The time complexity for these operations is O(1) on average, making HashSet efficient for large datasets.
- Uniqueness of Elements: HashSet ensures that all elements stored within it are unique. It automatically eliminates duplicates when adding elements, making it suitable for scenarios where uniqueness is a requirement, such as storing unique values in a collection or performing set operations.
- Hashing: HashSet uses hashing to organize and retrieve elements efficiently. Hashing allows for quick determination of the bucket in which an element should be stored or searched for, resulting in fast access times even for large collections.
- Implements Set Interface: HashSet implements the Set interface, providing methods for performing set operations such as union, intersection, and difference. It allows developers to work with sets in a familiar and standardized way, simplifying code maintenance and interoperability.
Disadvantages:
- Unordered: HashSet does not maintain any particular order of elements. While this allows for fast lookup and insertion, it also means that the order in which elements are stored may not be preserved. If order preservation is required, other data structures like LinkedHashSet can be used instead.
- Memory Overhead: HashSet requires additional memory to store the hash codes and maintain the hash table structure. This overhead can become significant when dealing with large datasets, especially if the load factor is set too low, leading to increased memory consumption.
- Not Thread-safe: HashSet is not thread-safe by default. If multiple threads access a HashSet concurrently, it can lead to data corruption or undefined behavior. Synchronization mechanisms need to be applied externally to ensure thread safety if needed, increasing complexity and potentially impacting performance.
- No Random Access: HashSet does not provide direct access to elements by index. While it offers efficient lookup operations, there is no way to retrieve elements by their position in the set. If random access is required, other data structures like ArrayList or LinkedList may be more suitable.
Interview Questions and MCQs on HashSet in Java
Interview Questions:
- What is a HashSet in Java?
- How does HashSet store elements internally?
- What is the difference between HashSet and TreeSet?
- How do you add elements to a HashSet?
- Can a HashSet contain duplicate elements?
- How do you remove elements from a HashSet?
- Explain the concept of hashing in HashSet.
- What is the purpose of the equals() and hashCode() methods in HashSet?
- How do you check if a specific element exists in a HashSet?
- What are the advantages and disadvantages of using HashSet?
Multiple Choice Questions (MCQs):
- Which interface does HashSet implement?
a) List
b) Set
c) Map
d) Collection
Answer: b) Set - Which of the following statements is true about HashSet?
a) Elements are stored in insertion order
b) Elements are sorted in natural order
c) Elements are stored based on their hash codes
d) Elements are stored based on their indexes
Answer: c) Elements are stored based on their hash codes - What happens when you try to add a duplicate element to a HashSet?
a) The element is added without any changes
b) An exception is thrown
c) The existing element is replaced with the new one
d) The new element is ignored
Answer: a) The element is added without any changes - Which method is used to remove all elements from a HashSet?
a) removeAll()
b) clear()
c) deleteAll()
d) erase()
Answer: b) clear() - How do you check if a HashSet is empty?
a) isEmpty()
b) size()
c) isNotEmpty()
d) hasElements()
Answer: a) isEmpty()