Data Structures 1: Lists and Tuples

In this tutorial, we will focus on basic data structures in Python: lists and tuples. Data structures can be seen as containers because they allow you to store, organize, and access data. Lists and tuples are sequential containers: the elements they contain are ordered, and their position is recorded in an index.

Lists

Definition

In the previous tutorial, we saw that strings were sequences of characters. Lists are also sequences, meaning ordered series of elements, but more general: the elements can be of different types.

Lists are constructed with square brackets [], and the elements of the list are separated by commas.

Let’s assign a first list to a variable a:

a = [1, 2, 3]
print(a)
[1, 2, 3]

The list a consists of integers, but a list can contain objects of practically any type.

b = ["a sequence", 56, "d"]
print(b)
['a sequence', 56, 'd']

It is also possible to create lists of lists (and so on), which allows for creating hierarchical data structures.

c = ["a sequence", 56, ["this list is nested", 75, "o"]]
print(c)
['a sequence', 56, ['this list is nested', 75, 'o']]

A nested list can also be constructed from already defined lists.

item1 = ["coffee", "500g"]
item2 = ["biscuits", "20"]
item3 = ["milk", "1L"]
inventory = [item1, item2, item3]
print(inventory)
[['coffee', '500g'], ['biscuits', '20'], ['milk', '1L']]

However, in the next tutorial, we will see that dictionaries are generally more suitable data structures than lists for representing hierarchical data.

List Length

As with strings, you can use the len function to count the number of elements in a list.

d = ["this", "is", "a", "list"]
len(d)
4

Indexing

Since lists are sequences, they are indexed similarly to strings. It is important to remember that position numbering starts at 0 in Python.

# Third element of the list a
print(a[2])
3

Of course, it is not possible to request an element that does not exist. Python raises an error indicating that the requested index is out of range.

print(a[5])
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[46], line 1
----> 1 print(a[5])

IndexError: list index out of range

To index a list contained within another list, use double indexing.

# First element of the sublist at index 2 (third position) of list c
print(c[2][0])
this list is nested

In terms of indexing, everything that was possible with strings is also possible with lists.

# All elements from index 1 (the second element) onward
print(b[1:])
[56, 'd']
# Reverse a list
print(a[::-1])
[3, 2, 1]
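
As a complement, here is a minimal sketch of a few more indexing patterns carried over from strings: negative indices and slices with a step. The list name notes is chosen only for illustration and does not appear elsewhere in the tutorial.

```python
# Illustrative list (the name `notes` is an assumption for this sketch)
notes = ["do", "re", "mi", "fa", "sol"]

last = notes[-1]          # negative indices count from the end
first_two = notes[:2]     # slice up to, but not including, index 2
every_other = notes[::2]  # step of 2: indices 0, 2 and 4

print(last)
print(first_two)
print(every_other)
```

Slicing always returns a new list, even when the slice contains a single element.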

Modifying Elements

It is possible to modify elements of a list manually, with a syntax similar to variable assignment.

# Reassign an element
d = [1, 2, "toto", 4]
d[2] = 3
print(d)
[1, 2, 3, 4]
# Substitute an element
a = [1, 2, 3]
b = ["do", "re", "mi"]
b[0] = a[2]
print(b)
[3, 're', 'mi']

Deleting Elements

The del statement allows you to delete an element by position. The elements that were after the deleted element then have their index reduced by 1.

e = [1, "do", 6]
print(e)
print(e[2])

del e[1]
print(e)
print(e[1])
[1, 'do', 6]
6
[1, 6]
6
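
The del statement also accepts a slice, which removes several consecutive elements at once. The sketch below uses a hypothetical list named numbers to show how the remaining elements shift toward the start.

```python
# Illustrative list (the name `numbers` is an assumption for this sketch)
numbers = [10, 20, 30, 40, 50]

del numbers[1:3]   # removes the elements at indices 1 and 2
print(numbers)     # [10, 40, 50]
print(numbers[1])  # 40, which used to be at index 3
```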

Some Useful Properties

Again, we find properties inherent to sequences.

# Concatenation
[1, 2, 3] + ["a", 12]
[1, 2, 3, 'a', 12]
# Replication
["a", "b", "c"] * 3
['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c']
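
Both operators build a new list and leave their operands untouched, which the following sketch checks. The variable names left, right, combined and repeated are illustrative.

```python
left = [1, 2, 3]
right = ["a", 12]

combined = left + right  # new list holding the elements of both operands
repeated = left * 2      # new list with the elements of `left` twice

print(combined)
print(repeated)
print(left)  # unchanged: [1, 2, 3]
```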

Some Useful Methods

As with strings, lists have many built-in methods, which are used in the format object.method(parameters). The most useful ones are presented below; other methods will be used in the end-of-section exercises.

# Add an element
a = [1, 2, 3]
a.append(4)
print(a)
[1, 2, 3, 4]
# Delete an element by position
b = ["do", "re", "mi"]
b.pop(0)
print(b)
['re', 'mi']
# Delete an element by value
b = ["do", "re", "mi"]
b.remove("mi")
print(b)
['do', 're']
# Reverse a list
l = [1, 2, 3, 4, 5]
l.reverse()
print(l)
[5, 4, 3, 2, 1]
# Find the position of an element
b = ["a", "b", "c", "d", "e"]
b.index("d")
3
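
A few other standard list methods are worth knowing, though this selection is only a suggestion: insert, extend, count and sort. The sketch below applies them to illustrative lists named scale and numbers. All of them except count modify the list in place.

```python
scale = ["re", "mi"]
scale.insert(0, "do")         # insert "do" at index 0
scale.extend(["fa", "sol"])   # append every element of another list
count_re = scale.count("re")  # number of occurrences of "re"

numbers = [3, 1, 2]
numbers.sort()                # sort in place, in ascending order

print(scale)
print(count_re)
print(numbers)
```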

Tuples

Definition

Tuples are another basic data structure in Python, similar to lists in their functionality. However, there is a fundamental difference: while the elements of a list can be modified by position as we saw earlier, tuples are immutable. Thus, the elements of a tuple cannot be changed without completely redefining the tuple.

When is it relevant to use a tuple rather than a list? In practice, tuples are much less frequently used than lists. Tuples are generally used to store data that is not meant to be modified during the execution of our Python program. This helps prevent data integrity issues, i.e., unwanted modification of input data. This can sometimes save long and tedious debugging sessions.

Another minor difference is that tuples are written with parentheses instead of square brackets. The different elements are still separated by commas.

x = (1, 2, "mi", "fa", 5)
x
(1, 2, 'mi', 'fa', 5)

To clearly distinguish from the normal use of parentheses (in calculations or to delimit expressions), a tuple with a single element is defined with a comma after the first element.

x1 = ("a", )
x1
('a',)
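
To see why the trailing comma matters, the sketch below compares the two spellings. The names not_a_tuple and one_tuple are illustrative.

```python
not_a_tuple = ("a")  # parentheses only group the expression: this is a string
one_tuple = ("a",)   # the trailing comma makes it a tuple

print(type(not_a_tuple))
print(type(one_tuple))
print(len(one_tuple))
```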

Let’s verify that it is impossible to modify or add an element to a tuple.

t = ("do", "rez", "mi")
t[1] = "re"
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[62], line 2
      1 t = ("do", "rez", "mi")
----> 2 t[1] = "re"

TypeError: 'tuple' object does not support item assignment
t = ("do", "re", "mi")
t.append("fa")
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[63], line 2
      1 t = ("do", "re", "mi")
----> 2 t.append("fa")

AttributeError: 'tuple' object has no attribute 'append'
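
"Redefining the tuple" can be sketched as follows: instead of changing the tuple, we build a new one from pieces of the old one and rebind the variable. This is one possible approach, relying on the fact that tuples support slicing and concatenation like other sequences.

```python
t = ("do", "rez", "mi")

# Build a new tuple with the corrected element and rebind the name t
t = t[:1] + ("re",) + t[2:]
print(t)

# "Adding" an element also means creating a new tuple
t = t + ("fa",)
print(t)
```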

Functionality

Tuples are indexed like lists.

print(x[0])
print(x[3:5])
1
('fa', 5)

And can also be used hierarchically.

t1 = ("a", "b", "c")
t2 = (1, 2, 3)
t3 = (t1, "and", t2)

print(t3)
print(t3[2][1])
(('a', 'b', 'c'), 'and', (1, 2, 3))
2

Tuples share some built-in methods with lists: those that do not cause modification of the object.

t = ("do", "re", "mi")
t.index("do")
0
t = ("do", "re", "mi", "re", "do")
t.count("re")
2

Conversion

The tuple and list functions convert a list to a tuple and a tuple to a list, respectively.

tuple(["do", "re", "mi"])
('do', 're', 'mi')
list((1, 2, 3, 4, 5))
[1, 2, 3, 4, 5]

These functions have other practical uses, which we will see in exercises.
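
One pattern that follows from these conversions is a round trip: convert a tuple to a list, modify the list, then convert back. The sketch below is only an illustration. The names coords, tmp and coords_updated and the values are hypothetical.

```python
coords = (48.85, 2.35)         # illustrative values

tmp = list(coords)             # mutable copy of the tuple's elements
tmp[0] = 45.76                 # modify the copy
coords_updated = tuple(tmp)    # back to an immutable tuple

print(coords)          # the original tuple is unchanged
print(coords_updated)
```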

Exercises

Comprehension Questions

  • Why are lists and tuples called containers?
  • What is the common point between lists and strings?
  • How is the order of elements in a sequence recorded in Python?
  • What is the fundamental difference between a list and a tuple?
  • When is it more advantageous to use a tuple rather than a list?
  • Can you have elements of different types (e.g., int and string) in the same list? In the same tuple?
Show the solution
  • 1/ Lists and tuples are called containers because they allow you to store and organize a collection of elements, possibly of different types, in a single data structure.

  • 2/ Lists and strings are both ordered sequences of elements that can be queried by position. In the case of a string, each element is itself a string. In the case of a list, the elements can be of different nature (string, list, tuple, etc.).

  • 3/ Each element of a sequence has a unique position, called an index, which starts at 0 for the first element, 1 for the second, and so on. The elements are stored in the order they are added.

  • 4/ A list is a mutable object: you can add, delete, or modify elements of a list after its creation. In contrast, tuples are immutable: once a tuple is defined, you cannot change its elements, add or delete elements.

  • 5/ Due to their immutability, tuples are particularly suitable for storing data that you want to ensure will not be modified by mistake. For example, to store constants of an algorithm (parameters, geographical coordinates, file paths, etc.).

  • 6/ Yes, it is quite possible to have elements of different types in the same list or tuple. These elements can be of basic types (e.g., int and string), but also containers (e.g., list, tuple, dictionary, etc.).

The 4 Seasons

Create 4 lists named after the 4 seasons, each containing the names of the associated months (months of changing seasons will be attributed to the previous season). Then create a list seasons containing the 4 lists. Try to predict what the following instructions will return (object type, number of elements, and content), then verify.

  • seasons
  • seasons[0]
  • seasons[0][0]
  • seasons[1][-1]
  • seasons[2][:3]
  • seasons[1][1:2] + seasons[-1][3:]
  • seasons[2:]
  • seasons + seasons[0]
  • seasons[3][::]
  • seasons[3][::-1]
  • seasons * 3
# Test your answer in this cell
Show the solution
spring = ["April", "May", "June"]
summer = ["July", "August", "September"]
autumn = ["October", "November", "December"]
winter = ["January", "February", "March"]

seasons = [spring, summer, autumn, winter]

l = seasons
print(type(l), len(l), l, "\n")

l = seasons[0]
print(type(l), len(l), l, "\n")

l = seasons[0][0]
print(type(l), len(l), l, "\n")

l = seasons[1][-1]
print(type(l), len(l), l, "\n")

l = seasons[2][:3]
print(type(l), len(l), l, "\n")

l = seasons[1][1:2] + seasons[-1][3:]
print(type(l), len(l), l, "\n")

l = seasons[2:]
print(type(l), len(l), l, "\n")

l = seasons + seasons[0]
print(type(l), len(l), l, "\n")

l = seasons[3][::]
print(type(l), len(l), l, "\n")

l = seasons[3][::-1]
print(type(l), len(l), l, "\n")

l = seasons * 3
print(type(l), len(l), l, "\n")
<class 'list'> 4 [['April', 'May', 'June'], ['July', 'August', 'September'], ['October', 'November', 'December'], ['January', 'February', 'March']] 

<class 'list'> 3 ['April', 'May', 'June'] 

<class 'str'> 5 April 

<class 'str'> 9 September 

<class 'list'> 3 ['October', 'November', 'December'] 

<class 'list'> 1 ['August'] 

<class 'list'> 2 [['October', 'November', 'December'], ['January', 'February', 'March']] 

<class 'list'> 7 [['April', 'May', 'June'], ['July', 'August', 'September'], ['October', 'November', 'December'], ['January', 'February', 'March'], 'April', 'May', 'June'] 

<class 'list'> 3 ['January', 'February', 'March'] 

<class 'list'> 3 ['March', 'February', 'January'] 

<class 'list'> 12 [['April', 'May', 'June'], ['July', 'August', 'September'], ['October', 'November', 'December'], ['January', 'February', 'March'], ['April', 'May', 'June'], ['July', 'August', 'September'], ['October', 'November', 'December'], ['January', 'February', 'March'], ['April', 'May', 'June'], ['July', 'August', 'September'], ['October', 'November', 'December'], ['January', 'February', 'March']] 

Doing Scales

By adding, deleting, and modifying elements, clean up the following list so that it contains the musical notes “do re mi fa sol la si” in the correct order.

l = ["do", "re", "re", "re", "fa", "sol", "solsi", "la"]

# Test your answer in this cell
Show the solution
l = ["do", "re", "re", "re", "fa", "sol", "solsi", "la"]

del l[1]  # You could also use: l.pop(1)
l[2] = "mi"
del l[5]
l.append("si")

print(l)
['do', 're', 'mi', 'fa', 'sol', 'la', 'si']

This example was simply to practice modifying and deleting elements. In practice, it would have been much simpler to directly create the correct list.

List Reversal

Propose two methods to reverse the list ["a", "random", "list"]. What is the major difference between the two methods?

# Test your answer in this cell
Show the solution
l1 = ["a", "random", "list"]
l1.reverse()
print(l1)

l2 = ["a", "random", "list"]
print(l2[::-1])
print(l2)
['list', 'random', 'a']
['list', 'random', 'a']
['a', 'random', 'list']

The reverse method modifies the list “in place”: the list is permanently reversed after executing it. In contrast, the method that reverses the list using indexing returns a new list and does not modify the existing one. To make this change permanent, you would need to overwrite the existing list or create a new one.

l2 = l2[::-1]
print(l2)
['list', 'random', 'a']
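
One way to make the "in place" distinction visible, as a sketch, is to keep a second name bound to the same list and compare object identities with the is operator. The names original, alias and copy_reversed are illustrative.

```python
original = ["a", "random", "list"]
alias = original                # second name for the same list object
copy_reversed = original[::-1]  # new, independent list

original.reverse()              # modifies the single shared object

print(alias)                      # also reversed, since it is the same object
print(alias is original)          # True
print(copy_reversed is original)  # False
```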

Pop’it

We saw that the my_list.pop(i) statement removes the element at index i from the list my_list. Using the Python documentation or a Google search, determine the default behavior of this method, i.e., what happens when no argument is passed to the pop method. Verify that you observe this behavior with an example of your choice.

# Test your answer in this cell
Show the solution
l = ["do", "re", "mi"]
l.pop()
print(l)
['do', 're']
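
Called without an argument, pop removes the last element. It also returns the removed element, which can be stored in a variable. A short sketch, with illustrative names:

```python
notes = ["do", "re", "mi"]

removed_last = notes.pop()    # removes and returns the last element
removed_first = notes.pop(0)  # removes and returns the element at index 0

print(removed_last)
print(removed_first)
print(notes)
```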

Min and Max of Different Lists

Beyond list methods, Python also provides built-in functions that take a list as an argument. For example: min and max. Verify their behavior:

  • on a list composed solely of numeric objects (int and float);
  • on a list composed solely of strings;
  • on a list composed of a mix of numeric and textual objects.
# Test your answer in this cell
Show the solution
a = [5, 800, 9.92, 0]
b = ["do", "re", "mi", "fa", "sol"]
c = [1, "mix", "of", 2]

print(min(a), max(a))
print(min(b), max(b))
print(min(c), max(c))
0 800
do sol
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[80], line 7
      5 print(min(a), max(a))
      6 print(min(b), max(b))
----> 7 print(min(c), max(c))

TypeError: '<' not supported between instances of 'str' and 'int'

The third expression raises an error: Python defines no ordering between strings and integers, so they cannot be compared.
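
For strings, min and max compare characters by their Unicode code points, so uppercase letters sort before lowercase ones. The sketch below illustrates this and, as one possible workaround, uses the key parameter with str.lower to compare case-insensitively. The list words is hypothetical.

```python
words = ["apple", "Banana", "cherry"]

smallest = min(words)                        # "B" sorts before "a"
smallest_nocase = min(words, key=str.lower)  # compare lowercased versions
largest = max(words)

print(smallest)
print(smallest_nocase)
print(largest)
```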

Empty List

Try creating an empty list. Verify its type. What could be the use of such an object?

# Test your answer in this cell
Show the solution
l = []
print(l)
print(type(l))
[]
<class 'list'>

We can indeed create an empty list. But what is the use? A very common use is to initialize a list that is then filled step by step inside a loop. Loops will be the subject of a future tutorial, but here is a simple example of such use.

for i in range(10):
    l.append(i)
    
print(l)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
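
As a variation on the same idea, the sketch below creates an empty list in two equivalent ways and fills one with computed values. The names empty_a, empty_b and squares are illustrative.

```python
empty_a = []      # literal syntax
empty_b = list()  # calling list with no argument

squares = []
for i in range(5):
    squares.append(i * i)  # add the square of each number in turn

print(empty_a == empty_b)
print(len(empty_a))
print(squares)
```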

The list Function

In the tutorial, we saw the list and tuple functions that allow converting from one type to the other. In reality, the functionality of these functions is more subtle: the code list(my_object) returns the “list version” of that object, just as str(3) returns '3', i.e., the string version of the integer 3.

Using the list function, find the “list versions” of the following objects:

  • the tuple a = (1, 2, 3);
  • the string b = "hello";
  • the integer c = 5
# Test your answer in this cell
Show the solution
a = (1, 2, 3)
print(list(a))

b = "hello"
print(list(b))

c = 5
print(list(c))
[1, 2, 3]
['h', 'e', 'l', 'l', 'o']
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[85], line 8
      5 print(list(b))
      7 c = 5
----> 8 print(list(c))

TypeError: 'int' object is not iterable

The last expression raises an error: an integer is not iterable (it is not a collection of elements), so a “list version” does not make sense. However, it is of course possible to create a list with a single element 5.

d = [5]
print(d)
[5]
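
More generally, list accepts any iterable object. As a sketch, it can be applied to the range object used earlier in the loop example, or to an empty string. The variable names are illustrative.

```python
from_range = list(range(5))  # a range is iterable
from_empty = list("")        # an empty string gives an empty list

print(from_range)
print(from_empty)
```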

Immutability of Tuples

We saw that tuples are immutable. But does this property transfer recursively? For example, is a list contained in a tuple itself immutable? Verify with an example of your choice.

# Test your answer in this cell
Show the solution
t = (1, 2, ["a", "list"])
t[2][0] = 26
print(t)
(1, 2, [26, 'list'])

Verdict: immutability only applies to the first level. The tuple always refers to the same objects, but a mutable object it contains, such as a list, can still be modified.
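
The sketch below contrasts the two levels: modifying the inner list succeeds, while replacing the element of the tuple itself fails. It uses a try/except block, a construct not yet covered in this tutorial, simply to catch the error and continue. The flag reassigned is illustrative.

```python
t = (1, 2, ["a", "list"])

t[2].append("grows")  # allowed: the inner list is mutable

try:
    t[2] = ["new"]    # not allowed: the tuple itself is immutable
    reassigned = True
except TypeError:
    reassigned = False

print(t)
print(reassigned)
```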

Sequence Unpacking

Read the section on sequence packing and unpacking in the Python documentation. Unpacking is a commonly used feature in practice. Verify that it works on the various sequential objects we have seen so far (strings, lists, and tuples).

# Test your answer in this cell
Show the solution
x, y, z = "abc"
print(y)

a, b, c, d = ["do", "re", "mi", "fa"]
print(c)

r, s, t, u = ("a", "tuple", "of", "test")
print(r)
b
mi
a
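
Some further uses of unpacking, shown as a sketch with illustrative names: swapping two variables, collecting the remaining elements with a starred name, and the ValueError raised when the number of names does not match the number of elements (caught here with try/except).

```python
a, b = 1, 2
a, b = b, a  # swap without a temporary variable

first, *rest = ["do", "re", "mi", "fa"]  # `rest` collects what is left, as a list

try:
    x, y = (1, 2, 3)  # two names for three elements
    mismatch_ok = True
except ValueError:
    mismatch_ok = False

print(a, b)
print(first, rest)
print(mismatch_ok)
```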