Python Sets: A Complete Guide

Written by Johnny on Oct 1st, 2022

Sets in Python provide a way to create an unordered collection of unique items, with no duplicates. Their main use case is checking if an item exists in a collection of items, which can be useful in many different situations.

Creating a set is pretty easy, and is similar to how we define lists in Python. The only difference is that we use {} curly brackets instead of [] square brackets to define a set:

mySet = { "some", "set", "of", "items" }
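
One catch is worth sketching here: empty curly brackets create a dictionary rather than a set, so an empty set has to come from set(). The snippet below is a minimal illustration of that, plus how duplicates are dropped. The variable names are made up for this example:

```python
# Empty curly brackets create a dict, not a set
empty_braces = {}
empty_set = set()

print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set

# Duplicates in a set literal are silently dropped
tags = {"python", "sets", "python"}
print(len(tags))  # 2
```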

Sets can also be defined from lists using the set() function:

mySet = set([ 'some', 'list', 'becoming', 'a', 'set' ])
# set is { 'some', 'list', 'becoming', 'a', 'set' }

You can also create sets from strings, using the same set() function:

mySet = set('somestring')
# set is { 's', 'o', 'm', 'e', 't', 'r', 'i', 'n', 'g' } - the repeated 's' only appears once

As with other collections, we can use len to get the number of items in a set, too:

mySet = set([ 'some', 'list', 'becoming', 'a', 'set' ])
print(len(mySet)) # Prints 5

Finally, we can also define what is known as a frozenset, which is simply an immutable version of a set whose items cannot be changed once it is created, using the frozenset() function:

mySet = frozenset([ 'some', 'list', 'becoming', 'a', 'set' ])
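
To make the immutability a little more concrete, here is a small sketch. It assumes nothing beyond the standard library, and the names colors and palette_names are just placeholders. It shows that a frozenset has no add() method, and that, because it is hashable, it can be used as a dictionary key (which a normal set cannot):

```python
colors = frozenset(["red", "green", "blue"])

# frozensets have no mutating methods such as add()
try:
    colors.add("yellow")
except AttributeError as error:
    print("Cannot modify a frozenset:", error)

# Because they are hashable, frozensets can be used as dictionary keys
palette_names = {colors: "primary"}
print(palette_names[frozenset(["blue", "green", "red"])])  # primary
```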

Combining and Intersecting Sets

We can combine two sets into one using the | operator. If an item exists in both sets, only one copy of it will be brought over. Here's an example where we combine two sets:


mySet = { "set", "one" }
myNewSet = { "set", "two" }

combinedSet = mySet | myNewSet
print(combinedSet) # { "set", "one", "two" }

We can intersect sets using &. That means we'll end up with a set containing only the items which exist in both sets. Using the same example, we can therefore create a set containing only the item "set":

mySet = { "set", "one" }
myNewSet = { "set", "two" }

combinedSet = mySet & myNewSet
print(combinedSet) # { "set" }

Another way we can combine sets is by subtraction, to end up with a new set which contains only the items from the first set that are not in the second. For example, the new set below only has one item - cool, since mySet and mySecondSet both contain "set" and "one":

mySet = { "set", "one", "cool" }
mySecondSet = { "set", "one" }
myNewSet = mySet - mySecondSet
print(myNewSet) # { "cool" }

Finally, we can do what is called symmetric difference, where we end up with a set that contains items found in either mySet or mySecondSet, but not both:

mySet = { "set", "one", "cool", "nice" }
mySecondSet = { "set", "one", "friendly" }
myNewSet = mySet ^ mySecondSet
print(myNewSet) # { "cool", "nice", "friendly" }
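
Each of the four operators above also has a method form. As a rough sketch (the names first and second are illustrative), the methods accept any iterable, such as a list, while the operators need a set on both sides. The results are sorted here only so the printed order is predictable:

```python
first = {"set", "one", "cool"}
second = ["set", "two"]  # a list, not a set

# The method forms accept any iterable as their argument
print(sorted(first.union(second)))                 # ['cool', 'one', 'set', 'two']
print(sorted(first.intersection(second)))          # ['set']
print(sorted(first.difference(second)))            # ['cool', 'one']
print(sorted(first.symmetric_difference(second)))  # ['cool', 'one', 'two']

# The operator forms require both sides to be sets
try:
    first | second
except TypeError:
    print("| only works between sets")
```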

Testing Membership using Sets

The main use case for sets is testing membership, to see if an item exists in a set. We can do this using the in and not in operators. Let's look at an example. If we want to check whether orange is in our fruits set, we use in:

fruits = { "orange", "apple", "peach" }
print("orange" in fruits) # True

Or, if we want to check if orange is not in fruits, we use not in:

fruits = { "orange", "apple", "peach" }
print("orange" not in fruits) # False
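
Sets are hash-based, so membership checks stay fast on average even as the set grows. One common place this shows up is tracking what we have already seen while looping. The function below is a hypothetical helper written for this guide, not something built into Python:

```python
def find_duplicates(items):
    """Return the set of values that appear more than once in items."""
    seen = set()
    duplicates = set()
    for item in items:
        if item in seen:
            # We have met this item before, so it is a duplicate
            duplicates.add(item)
        else:
            seen.add(item)
    return duplicates

basket = ["orange", "apple", "orange", "peach", "apple"]
print(sorted(find_duplicates(basket)))  # ['apple', 'orange']
```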

Making a copy of a set

As with lists, we can make a copy of a set using the copy() method, which is available on all sets. The copy has the same value as the original, but is a separate object in memory. That means that when compared by value using ==, the sets will be the same, but when compared by identity using is, they will not be the same:

mySet = { "set", "one" }
mySetCopy = mySet.copy()

print(mySet == mySetCopy) # True
print(mySet is mySetCopy) # False
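
The practical upshot is that changing a copy leaves the original alone, while plain assignment does not create a copy at all. A short sketch, with illustrative variable names:

```python
original = {"set", "one"}
duplicate = original.copy()

# Changing the copy does not affect the original
duplicate.add("two")
print(sorted(original))   # ['one', 'set']
print(sorted(duplicate))  # ['one', 'set', 'two']

# Plain assignment does not copy: both names point to the same set
alias = original
alias.add("three")
print("three" in original)  # True
print(alias is original)    # True
```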

Testing for Supersets and Subsets

Another really useful use case for sets is the ability to check if a set is a superset or subset of another set (which is a bit of a tongue twister):

  • subsets will be sets that are fully contained within another set.
  • supersets will be sets that contain fully the members of another set.

Checking for Subsets in Python

Let's say we have two sets, as shown below:

mySet = { "set", "one", "two" }
mySecondSet = { "set", "one" }

mySecondSet is in fact a subset of mySet, since it is fully contained within mySet. We can test for this using the <= operator:

mySet = { "set", "one", "two" }
mySecondSet = { "set", "one" }

print(mySecondSet <= mySet) # True

We can also use the < operator to check for true subsets (often called proper subsets), meaning that mySecondSet is contained within mySet, but is not equal in value to mySet. In the example above, this is also true:

mySet = { "set", "one", "two" }
mySecondSet = { "set", "one" }

print(mySecondSet < mySet) # True

In the following example, however, mySecondSet is indeed a subset of mySet, but it is not a true subset, since both are equal in value:

mySet = { "set", "one" }
mySecondSet = { "set", "one" }

print(mySecondSet <= mySet) # True
print(mySecondSet < mySet) # False
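
The <= check also exists as a method, issubset(), which accepts any iterable rather than only a set. There is no method form of <, so a true subset check still needs the operator. A minimal sketch with made-up names:

```python
small = {"set", "one"}
large = ["set", "one", "two"]  # a list works with the method form

print(small.issubset(large))  # True
print(small.issubset(small))  # True, every set is a subset of itself

# < has no method equivalent, and needs a set on both sides
print(small < set(large))  # True
print(small < small)       # False
```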

Checking for Supersets in Python

Supersets work exactly the same way as subsets - the only difference is that the operator points the opposite way. So > is used to check for true supersets, while >= is used to check for any supersets. Using our example from before, mySet is a superset of mySecondSet - so the following returns True:

mySet = { "set", "one", "two" }
mySecondSet = { "set", "one" }

print(mySet > mySecondSet) # True

And similarly, while mySet is a superset of mySecondSet below, it is not a true superset, so > does not return True, while >= does:

mySet = { "set", "one" }
mySecondSet = { "set", "one" }

print(mySet >= mySecondSet) # True
print(mySet > mySecondSet) # False
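
One place a superset check can come in handy is validating that some data contains everything we need. The helper below, has_required_fields, is a hypothetical function invented for this example. It relies on the fact that calling set() on a dictionary gives a set of its keys:

```python
def has_required_fields(record, required_fields):
    """Return True if record contains every name in required_fields."""
    return set(record) >= set(required_fields)

user = {"name": "Ada", "email": "ada@example.com", "age": 36}

print(has_required_fields(user, ["name", "email"]))  # True
print(has_required_fields(user, ["name", "phone"]))  # False

# issuperset() is the method form of >= and accepts any iterable
print(set(user).issuperset(["name", "email"]))  # True
```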

Testing if two sets have completely different values in Python

Sometimes, you'll also want to check if two sets have no items in common at all. For example, { "one", "two" } and { "three", "four" } are two sets which share no values. In Python, the isdisjoint() method allows us to check for that:

mySet = { "one", "two" }
mySecondSet = { "three", "four" }

print(mySet.isdisjoint(mySecondSet)) # True
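
Put another way, two sets are disjoint when their intersection is empty. The sketch below shows both ways of writing the check, using illustrative names. Like the other method forms, isdisjoint() accepts any iterable:

```python
weekdays = {"mon", "tue"}
weekend = {"sat", "sun"}
busy_days = ["tue", "sat"]  # a list also works as the argument

print(weekdays.isdisjoint(weekend))    # True
print(weekdays.isdisjoint(busy_days))  # False, "tue" is in both

# Equivalent check: the intersection of disjoint sets is empty
print(len(weekdays & weekend) == 0)  # True
```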

Other Set Methods

While everything we've talked about so far applies both to frozensets and sets, there are also a few other methods available only to sets, which allow us to mutate their value. These are:

  • set.add('item') - adds an item to the set.
  • set.remove('item') - removes an item from the set, and raises a KeyError if the item is not in the set.
  • set.update(newSet) - adds all items from newSet to the original set. This can also be written as set |= newSet
  • set.clear() - removes all items from a set
  • set.pop() - removes and returns an arbitrary item from the set. It takes no arguments, since sets are unordered and have no positions.
  • set.intersection_update(newSet) - keeps only items found in both set and newSet. Can also be written as set &= newSet
  • set.difference_update(newSet) - takes set, and removes any items found in newSet. Can also be written as set -= newSet
  • set.symmetric_difference_update(newSet) - keeps only items found in either set or newSet, but not both. Can also be written as set ^= newSet

While the first 5 provide easy ways to add and remove items from sets, the last 3 are the same as what we talked about before when we covered intersecting and combining sets. The difference here is that these methods change the set itself, rather than returning a new set. While this is possible on normal sets, we cannot apply these methods to a frozenset.
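
Here is a rough walk-through of these mutating methods on one set. The variable names are just for illustration. It also uses discard(), a built-in set method not listed above, which behaves like remove() but does nothing if the item is missing:

```python
fruits = {"orange", "apple"}

fruits.add("peach")
print(len(fruits))  # 3

# discard() ignores missing items, while remove() raises a KeyError
fruits.discard("banana")
try:
    fruits.remove("banana")
except KeyError:
    print("banana was not in the set")

# pop() removes and returns an arbitrary item, so we cannot predict which
removed = fruits.pop()
print(removed in fruits)  # False

# update() accepts any iterable
fruits.update(["kiwi", "plum"])
print(len(fruits))  # 4

# In-place intersection keeps only items found in both sets
fruits &= {"kiwi", "plum", "fig"}
print(sorted(fruits))  # ['kiwi', 'plum']

fruits.clear()
print(len(fruits))  # 0
```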

Conclusion

That should be everything you need to know about sets in Python. I hope you've enjoyed this guide. I've also written more about all of the different data structures available in Python here. If you've enjoyed this guide, you might also enjoy my other engineering content here.

Thanks for reading!