Introduction to Python Programming - Data Structures
Table of Contents
Overview
Data structures are containers that enable storage, grouping, access, iteration, and manipulation of basic data types. Data structures include Sequences
, Set
, and Dictionary
.
Sequences
A sequence is a container holding data (elements) in an ordered manner. Sequence types include list
, str
, tuple
, and bytes
. Among them, only list
is mutable while the rest are immutable. The sequence below contains 10 integer elements (n = 10)
.
The element at the first position takes an index value of 0 while the last takes an index value of n - 1
, 9.
Note:
str
refers to strings.
Operations on Sequences
-
You can check if an element exists in a sequence using the
in
keyword or operator. The result is the boolean valueTrue
orFalse
. You can also do this for other containers like Set and Dictionary. For examplesequence = [99, 22, 43, 40, 85, 66, 17, 81, 2, 8] 40 in sequence # True 82 in sequence # False
If you put the code in a Python file (.py), use the
print
statement to see the output. For exampleprint(40 in sequence)
. At a Python REPL, you can run the code as is. -
You can access the elements of a sequence using the integer indices. This operation, known as indexing is fast. For example, to get the fifth (5) element at index position four (4), you would write
sequence[4] #85
To retrieve a slice of elements, specify the start and end index.
For example, the code below will return a slice of the first to the fifth element.
sequence[0:5] # [99, 22, 43, 40, 85]
Note that the sixth element (at index 5) is excluded. You may shorten the above statement to
sequence[:5]
by omitting the start index. You may also omit the end index in certain cases. Omitting the start index will return a slice from the start index to the rest elements. For examplesequence[3:]
will yield[40, 85, 66, 17, 81, 2, 8]
. Also, omitting the start and end index together (sequence[:]
) will return a slice containing all elements. -
Sequences also support iteration. You can go through each element in a sequence using a
for
loop statement (you will see more loops in my next article). For example, in thefor
loop below, we multiply each element by three (3) and print the results.for element in sequence: print(3 * element)
-
Other operations on sequences include counting using the
len()
function; reversing, appending, and popping usingreverse()
,append()
andpop()
methods/functions in the case of mutable sequences. Inlist
andstr
, you can count the number of times an element appears using thecount()
method. You can find an element's position using thefind()
orindex()
method if the method(s) is available on the sequence. For examplelen(sequence) # 10 a_string_is_also_a_sequence = "Hello reader!" len(a_string_is_also_a_sequence) # 13 sequence.index(81) #7 meaning 81 is at index 7 a_string_is_also_a_sequence.count("l") #2, letter l occurs twice
Let's look at the sequence types starting with the list
data structure.
List
The list
data structure is an ordered sequence of elements enclosed in square brackets ([]
). The content could be any data type or data structure, including a list. Since lists are mutable, adding or removing elements modifies the list.
Creation
Use the literal square brackets []
or the list()
function. Below is a list of names, numbers, two empty lists, and so on.
names = ["John", "Doe", "Peter", "Griffin", "Lois", "Lane"]
numbers = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
empty_1 = []
empty_2 = list()
# you can convert a sequence to a list using the list function
list_of_letters_abcd = list("abcd")
# The string method split creates a list of strings
list_of_letters_efgh = "e f g h".split(" ")
A list can contain other lists (nesting). For example
nested_list = [names, numbers]
Note: A list can contain a mixture of types e.g.
["a", 1, [], 2.3]
. However, you should not do this as it makes working with such a list difficult.
Operations
Indexing (Access) and Slicing
As mentioned above, you can retrieve the elements of a list by indexing and slicing.
# Indexing
# The first element or name
names[0] # 'John'
# The third name starting from the back
names[-3] # 'Griffin'
# Slicing
# The first three numbers
numbers[0:3] # [10, 20, 30]
# same as above(we can omit the start index)
numbers[:3] # [10, 20, 30]
# The third number to the end
numbers[2:] # [30, 40, 50, 60, 70, 80, 90, 100]
Modifying (Mutation)
# Replace the first element 'John' with 'James'
names[0] = "James"
# Replace 'Lois' with 'Mary'
names[-2] = "Mary"
names # ['James', 'Doe', 'Peter', 'Griffin', 'Mary', 'Lane']
Deletion
To delete an element from a list, use the del
keyword. For example, del some_list[0]
will remove the item in the first position. Afterward, the list is re-ordered so that the element initially at position 2 (index 1) takes position 1 (index 0), and so on.
Copying
Assigning a list variable to a new variable does not create a copy. The new variable refers to the original one. Python does what we call referencing behind the scenes. Referencing saves memory. For example,
numbers_a = numbers
numbers_a == numbers # True
We can check if both lists point to the same object using the is
keyword. For example
numbers_a is numbers # True
numbers_a
and numbers
are equal and the same. Modification to either will reflect in both variables. Let us add 110 at the end of the list using numbers_a
.
# We add 110 to the numbers_a list
numbers_a.append(110)
Print the content of the list through the numbers
variable.
# numbers also contains the '110' added to numbers_a
numbers # [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110]
In cases where you need to make a copy of the original list, use either a full slice [:]
, the list()
function, or the copy()
method.
#1 Copying with a full slice
a = numbers[:]
# Copying the list function
b = list(numbers)
# Use the copy method
c = numbers.copy()
Concerning content, the new copy is equal to the original list. However, the lists are not identical.
# Do we have the same elements?
a == numbers # True
b == numbers # True
c == numbers # True
# Are we identical?
a is numbers # False
b is numbers # False
c is numbers # False
a
and numbers
are equal but not the same. They are still equal because they contain the same elements. Variables b
and c
are also equal but not identical.
Addition and Multiplication
You can combine two or more lists with the +
operator. For example
[1, 2] + [3, 4] # [1, 2, 3, 4]
You can also use the *
operator to generate a new list containing repetitions of the elements of the old list. For example
[1] * 5 # [1, 1, 1, 1, 1]
[1, 2] * 5 # [1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
Some List Methods
List methods include but are not limited to append()
, pop()
, reverse()
, index()
, and count()
.
# append: add elements to the end/back of the list
names.append('Owen')
names.append('Wilson')
names # ['James', 'Doe', 'Peter', 'Griffin', 'Mary', 'Lane', 'Owen', 'Wilson']
# pop: remove and return the last element from a list
# Specify an index to remove the item in that position
# The removed item is returned (can be stored in a variable)
names.pop() # 'Wilson'
names # ['James', 'Doe', 'Peter', 'Griffin', 'Mary', 'Lane', 'Owen']
# pop(index): remove and return an element at a given index name.pop(0) # 'James'
names # ['Doe', 'Peter', 'Griffin', 'Mary', 'Lane', 'Owen']
# reverse: Reverse the positions of the element
# before reversal
numbers # [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110]
numbers.reverse()
# after reversal
numbers # [110, 100, 90, 80, 70, 60, 50, 40, 30, 20, 10]
# index: get the position/index of an element.
# returns a ValueError if the element is not found
names.index('Lane') # 4
# count: how many "Peter" do we have in the names list
names.count('Peter') # 1
# Use the len function to determine the number of elements in a list
len(numbers) # 10
Tuples
Tuples are very similar to lists except that they are immutable and their elements are enclosed within a parenthesis ()
. Prefer tuple to list if you do not want the content of your collection to change.
Creation
Use the literal parenthesis ()
or the tuple()
function.
# Literally
empty_tuple = ()
my_tuple = (1, 2, 3)
# you do not always need the parenthesis
my_other_tuple = 4, 4, 5, 5, 5
d = 1,
# from sequences such as string, list
names_tuple = tuple(names)
numbers_tuple = tuple(numbers)
Note that a single element in ()
without a comma ,
will not give you a tuple.
not_a_tuple = (2)
type(not_a_tuple) # <class 'int'>
names_tuple # ('Doe', 'Peter', 'Griffin', 'Mary', 'Lane', 'Owen')
numbers_tuple # (110, 100, 90, 80, 70, 60, 50, 40, 30, 20, 10)
my_other_tuple # (4, 4, 5, 5, 5)
Operations
You can retrieve elements of a tuple via indexing
and slicing
.
names_tuple[2] # 'Griffin'
names_tuple[:4] # ('Doe', 'Peter', 'Griffin', 'Mary')
However, you cannot modify a tuple.
names_tuple[0] = 'Fail'
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-106-f0d527e00ac0> in <module>
----> 1 names_tuple[0] = 'Fail'
TypeError: 'tuple' object does not support item assignment
Like all sequences, you can count the number of elements in a tuple using the len
function.
# count
len(names_tuple) # 6
You can also use the list()
function to create a new list from a tuple.
# convert to list
list(names_tuple) # ['Doe', 'Peter', 'Griffin', 'Mary', 'Lane', 'Owen']
Like in a list, the +
and *
operators perform similar actions.
# Combine tuples: create a new one
my_tuple + my_other_tuple # (1, 2, 3, 4, 4, 5, 5, 5)
# Repeat the elements and create a new tuple
my_tuple * 5 # (1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3)
Tuple unpacking
You can assign the elements of a tuple to variables using tuple unpacking. For example, the variable my_tuple
contains three elements.
my_tuple # (1, 2, 3)
To store these three elements in three different variables, you would write
# unpack my_tuple
first, second, third = my_tuple
first # 1
second # 2
third # 3
Tuple Methods
index()
and count()
are examples of methods/functions of a tuple.
# How many 5s do we have in the my_other_tuple variable
my_other_tuple.count(5) # 3
# Mary is at which position/index?
names_tuple.index('Mary') # 3
Check out my previous article on data types where I discussed
strings
briefly. Note that thebytes
sequence is similar to strings. So, I will not discuss it in this article :)
Sets
A set is a mutable collection of unique elements enclosed in curly braces ({}
). However, unlike lists, tuples, and strings, sets are not ordered. So you cannot index into or slice sets.
Creation
Use either literal {}
or the set
function.
letters_set = {'a', 'b', 'c', 'c', 'a', 'b', 'c'}
numbers_set = set([1, 2, 2, 3, 3, 5, 5, 4, 4, 2, 5])
empty_set = set()
not_an_empty_set = {}
type(empty_set) # set
type(not_an_empty_set) # dict
Note that an empty
{}
gives an empty dictionary and not an empty set. Useset()
to create an empty set.
numbers_set # {1, 2, 3, 4, 5}
letters_set # {'a', 'b', 'c'}
Set Operations
Let's create some sets to perform some set operations.
set_1 = {1, 2, 3, 4, 5}
set_2 = {1, 2, 3}
set_3 = {4, 5}
set_4 = {6, 7}
Union: all elements in both sets
set_1.union(set_2) # {1, 2, 3, 4, 5}
set_1.union(set_4) # {1, 2, 3, 4, 5, 6, 7}
Intersection: Only elements that appear in both sets
set_1.intersection(set_2) # {1, 2, 3}
set_1.intersection(set_3) # {4, 5}
Difference: elements present in the first set but not in the second set
set_1.difference(set_2) # {4, 5}
set_3.difference(set_4) # {4, 5}
Set Methods
add
, update
, union
, intersection
, and difference
are some examples of set methods. See python documentation for more information.
Update: fill a set with the elements of another set
set_1.update(set_4)
set_1 # {1, 2, 3, 4, 5, 6, 7}
Add: Add element(s) to a set
set_1.add(8)
set_1.add(9)
set_1.add(9)
set_1.add(10)
set_1 # {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
Remove an element from a set
# remove 10. Then remove 1
set_1.remove(10)
set_1.remove(1)
set_1 # {2, 3, 4, 5, 6, 7, 8, 9}
Dictionaries
Just like the English dictionary, a Python dictionary is a collection of keys and their values (key-value pair) enclosed in curly braces ({}
). Like sets, dictionaries are mutable and unordered
Note: Dictionary keys must be immutable (for example, strings, and integers). Therefore, a mutable type (such as a list) cannot be a dictionary key. Dictionary values can be of any data type, including a dictionary.
Creation
Use the literal curly bracket {}
in the following format { "key": value }
or use the dict
function.
# Literal
items = { "pen": 30, "chalk": 20, "book": 45 }
empty_dict = {}
# Use the dict function
same_items = dict(pen=30, chalk=20, book=45)
# dict function taking a list of tuples
its_all_i_see = dict([("pen", 30), ("chalk", 20), ("book", 45)])
empty_dict_2 = dict()
items # {'pen': 30, 'chalk': 20, 'book': 45}
same_items # {'pen': 30, 'chalk': 20, 'book': 45}
its_all_i_see # # {'pen': 30, 'chalk': 20, 'book': 45}
type(items) # dict
key_types = {"string": 1, 1 : 5, (5,): "tuple" }
key_types # {'string': 1, 1: 5, (5,): 'tuple'}
A mutable key like a list, set, or dictionary will result in an error.
key_types_fail = { [1, 2]: "whatever" }
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-163-59389ba2d9af> in <module>
----> 1 key_types_fail = { [1, 2]: "whatever" }
TypeError: unhashable type: 'list'
Operations
Access and Modification
Use dictionary[key]
or dictionary.get(key)
to retrieve the value associated with the key.
items["pen"] # 30
items.get("pen") # 30
# Return the value 0 if there is no pen.
items.get("pen", 0)
dictionary[key]
throws an error if the key does not exist. This may crash your program. Prefer .get(key)
as it returns None
if the key is not present. With the get()
method, you can also specify a default value to return if the key is present
Use dictionary[key] = value
to add a new item to a dictionary or to modify an existing key with a new value. For example:
# Add a new item table to the items
items["table"] = 100
items # # {'pen': 30, 'chalk': 50, 'book': 45, 'table': 100}
# change the value of chalk to 50. The original was 20
items["chalk"] = 50
items # {'pen': 30, 'chalk': 50, 'book': 45, 'table': 100}
Deletion
# Remove table from the dictionary
del items['table']
Addition and Multiplication
The dictionary type does not support the +
and *
operators.
del dictionary[key]
throws a KeyError if the key is not in the dictionary.
Dictionary Methods
Some dictionary methods are keys
, values
, clear
, items
, and update
.
same_items # {'pen': 30, 'chalk': 20, 'book': 45}
# get a list of keys only using the keys() method
same_items.keys() # dict_keys(['pen', 'chalk', 'book'])
# the values only
same_items.values() # dict_values([30, 20, 45])
# The items method gives you a tuple of the keys and values of the dictionary
same_items.items() #dict_items([('pen', 30), ('chalk', 20), ('book', 45)])
# clear the dictionary
same_items.clear()
# Now empty
same_items # {}
# add the elements from another dictionary
same_items.update(items)
same_items # {'pen': 30, 'chalk': 50, 'book': 45}
Conclusion
In this article, we have looked at some data structures. In the next, we will look at control flow in Python. Please follow me on Twitter to get notified whenever I release a new article. Thanks for reading!