Introduction
In Python programming, we sometimes need to represent a group of values where the order of elements does not matter, and duplicates should be removed automatically. Lists and tuples allow duplicates and preserve order, but certain situations require the exact opposite. For example, if you are sending SMS messages to a group of students and your list contains duplicate phone numbers, you do not want to send the same message multiple times to the same student. In such cases you need a data structure that guarantees only unique values and does not care about ordering. This is where the set data structure becomes the ideal choice.
A set in Python is modeled on the basic concepts of set theory from mathematics. In mathematics, sets store only unique values and do not preserve any particular order. Python follows the same rules.
When Should You Use a Set?
Before learning how a set works, it is important to know when a set is the correct choice.
A set is preferred when:
- Order is not important
- Duplicates are not allowed
- You want to store unique values only
- Indexing and slicing are not required
- Fast membership testing is needed
If these requirements match your scenario, a set is the most suitable Python data structure.
Example scenario: You want to collect unique phone numbers from a large list, ignoring duplicates and not worrying about the order of final output.
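The phone number scenario can be sketched in a few lines. The numbers below are made-up placeholders, and the variable names are chosen only for illustration.

```python
# Hypothetical phone numbers; the duplicates are intentional.
phone_numbers = [
    "9876543210",
    "9123456780",
    "9876543210",
    "9000000001",
    "9123456780",
]

# Passing the list to set() keeps one copy of each number.
unique_numbers = set(phone_numbers)

print(len(phone_numbers))   # 5
print(len(unique_numbers))  # 3

# Iteration order over a set is arbitrary, so do not depend on it.
for number in unique_numbers:
    print("Sending SMS to", number)
```

Each student receives the message once, no matter how many times the number appeared in the original list.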
Representing a Set in Python
Basic Set Creation
A set is represented using curly braces:
s = {10, 20, 30, 40}
To confirm it is a set:
print(type(s))
Output:
<class 'set'>
Curly braces denote a set only when they contain at least one element and the elements are plain values rather than key: value pairs. Empty curly braces create a dictionary, not a set.
Duplicate Values
If duplicate values are included, Python automatically removes them because sets only store unique elements.
Example:
s = {10, 20, 10, 20, "Durga", 30, 40}
print(s)
Even though the values 10 and 20 appear multiple times in the declaration, the final printed set will contain them only once. However, the order in which the values appear in the output is unpredictable. Python may print them in any order because sets do not maintain any positional arrangement.
Example output (actual order may vary):
{40, 10, 'Durga', 20, 30}
The main points:
- Duplicate values are removed
- Output order is not guaranteed
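Because the printed order can vary, it is safer to check a set's contents with len(), the in operator, and equality comparison than by reading its printed form. A small sketch using the same values:

```python
s = {10, 20, 10, 20, "Durga", 30, 40}

# Seven values were written, but only five distinct ones are stored.
print(len(s))  # 5

# Two sets are equal when they hold the same elements,
# regardless of the order in which they were written.
print(s == {40, "Durga", 30, 20, 10})  # True

# Membership is checked with the `in` operator.
print(10 in s)      # True
print("Ravi" in s)  # False
```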
Important Behavior of Sets
Sets Do Not Preserve Order
If you ask what the first or last element of a set is, the question itself is invalid because sets do not maintain order. You cannot rely on any predictable position.
Example:
s = {10, 20, 30}
There is no concept such as s[0] or s[-1]. Attempting to do this will cause an error.
Indexing Is Not Supported
s[0]
Results in:
TypeError: 'set' object is not subscriptable
Slicing Is Not Supported
s[1:4]
Results in:
TypeError: 'set' object is not subscriptable
These restrictions exist because indexing and slicing both rely on an ordered sequence, which sets do not provide.
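The sketch below catches both errors and then shows one possible workaround when a position really is needed: copying the elements into a sorted list. The exact wording of the error message may differ between Python versions, so it is printed rather than assumed.

```python
s = {10, 20, 30}

try:
    s[0]
except TypeError as exc:
    print("Indexing failed:", exc)

try:
    s[1:4]
except TypeError as exc:
    print("Slicing failed:", exc)

# If positions are needed, build an ordered sequence first.
ordered = sorted(s)   # a new list, sorted in ascending order
print(ordered[0])     # 10
print(ordered[-1])    # 30
print(ordered[1:3])   # [20, 30]
```

Note that sorted() only works when the elements can be compared with each other, which holds here because they are all integers.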
Heterogeneous Elements Are Allowed
Just like lists, tuples, and dictionaries, sets can also contain elements of different data types.
Example:
s = {10, 20.5, "Durga", True, None}
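Python does not require the elements of a set to share a type. The only requirement is that each element be hashable, so mutable objects such as lists, dictionaries, and other sets cannot be stored in a set.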
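The following sketch shows a mixed-type set accepting a tuple but rejecting a list, since a list is unhashable. The tuple and list values are arbitrary examples.

```python
s = {10, 20.5, "Durga", True, None}
print(len(s))  # 5

# A tuple of hashable values is itself hashable, so it can be an element.
s.add((1, 2))

# A list is mutable and unhashable, so adding it raises TypeError.
try:
    s.add([1, 2])
except TypeError as exc:
    print("Cannot add a list:", exc)

print(len(s))  # 6
```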
Modifying a Set
Even though sets do not preserve order, they are mutable. You can add new elements or remove existing ones after creation.
Adding Values
The method for adding elements to a set is add(), not append().
Example:
s = {10, 20, 30, 40}
s.add(50)
print(s)
The element 50 will be added, but not necessarily at the end. Since sets do not maintain any order, Python will insert the value in a position determined by its internal hashing mechanism.
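Since the printed position of the new element is not meaningful, the sketch below checks the result with in and len() instead. It also shows that adding a value that is already present has no effect.

```python
s = {10, 20, 30, 40}

s.add(50)
print(50 in s)  # True
print(len(s))   # 5

# Adding a value that is already present leaves the set unchanged.
s.add(50)
print(len(s))   # 5
```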
Removing Values
Use remove() to delete an element:
s.remove(30)
print(s)
After removing 30, the set no longer contains that value. If the value is not in the set, remove() raises a KeyError.
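A minimal sketch of removal, including what happens when the value is missing. The discard() method is a standard set method that this article does not otherwise cover; it is shown here only as an alternative for cases where a missing value should be ignored.

```python
s = {10, 20, 30, 40}

s.remove(30)
print(30 in s)  # False

# remove() raises KeyError when the value is absent.
try:
    s.remove(99)
except KeyError:
    print("99 is not in the set")

# discard() removes the value if present and does nothing otherwise.
s.discard(99)
print(s == {10, 20, 40})  # True
```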
Why append() Is Not Used for Sets
The append() method is used for lists because new elements are always added at the end of the list. This assumption does not hold for sets. Since sets do not have a concept of the last position or any specific arrangement, Python uses the add() method instead. The name add() indicates that a value is inserted into the set without implying a position.
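The difference can be seen side by side. The variable names below are illustrative only.

```python
numbers_list = [10, 20]
numbers_list.append(30)
print(numbers_list)  # [10, 20, 30]; append() always adds at the end

numbers_set = {10, 20}
numbers_set.add(30)
print(numbers_set == {10, 20, 30})  # True; no position is implied

# Each type offers only its own method.
print(hasattr(numbers_set, "append"))  # False
print(hasattr(numbers_list, "add"))    # False
```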
Creating an Empty Set
This is a very important point and a common beginner mistake.
Incorrect Method
Using empty curly braces creates an empty dictionary, not an empty set:
s = {}
print(type(s))
Output:
<class 'dict'>
This happens because curly braces are used for both dictionary and set literals, and Python always interprets empty braces as a dictionary. Dictionaries used the {} syntax long before set literals were added to the language, so {} kept its original meaning.
Correct Method for Empty Set
To create an empty set, you must use the set() function:
s = set()
print(type(s))
Output:
<class 'set'>
Python has no literal syntax for an empty set, so calling set() is the standard way to create one.
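The two cases can be compared directly. This sketch also shows that an empty set behaves like other empty containers: its length is 0 and it is treated as false in a condition.

```python
empty_dict = {}
empty_set = set()

print(type(empty_dict).__name__)  # dict
print(type(empty_set).__name__)   # set
print(len(empty_set))             # 0

# An empty set is falsy, like other empty containers.
print(bool(empty_set))  # False

# Elements can be added afterwards in the usual way.
empty_set.add(10)
print(empty_set)  # {10}
```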
Difference Between List and Set
Ordering
- List preserves order
- Set does not preserve order
Duplicates
- List allows duplicates
- Set does not allow duplicates
Representation
- List uses square brackets []
- Set uses curly braces {}
Indexing and Slicing
- List supports indexing and slicing
- Set does not support indexing or slicing
Use Cases
- Use a list when order matters and duplicates may exist
- Use a set when uniqueness is required and order does not matter
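The differences listed above can be observed by building a list and a set from the same values. The last step is an aside: dict.fromkeys is one possible way to drop duplicates while keeping first-seen order, which relies on dictionaries preserving insertion order (Python 3.7 and later).

```python
values = [30, 10, 20, 10, 30]

as_list = list(values)
as_set = set(values)

print(as_list)            # [30, 10, 20, 10, 30]; order and duplicates kept
print(len(as_set))        # 3; duplicates removed
print(as_list[0])         # 30; indexing works on a list
print(as_list.count(10))  # 2

# Membership testing works on both types.
print(20 in as_list, 20 in as_set)  # True True

# Aside: remove duplicates but keep first-seen order.
deduped_in_order = list(dict.fromkeys(values))
print(deduped_in_order)   # [30, 10, 20]
```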
Summary
A set is a powerful data structure used when your primary goal is to maintain unique elements without caring about their order. Sets automatically remove duplicates and provide efficient membership tests. They are mutable, allowing you to add or remove items, but they do not allow indexing or slicing. Curly braces are used to represent sets, except in the case of empty sets, where the set() function must be used. When your requirement demands uniqueness and non-ordered data, the set is the best choice compared to lists or tuples.
