In this first practical session, we will discover the most fundamental objects in Python: numbers and strings. We will also see how we can assign objects to variables to perform operations with these objects.
Python offers different types of numerical objects. In this tutorial, we will focus on the two most commonly used types:
- int (for integers)
- float (for decimal numbers)
In general, we use the type function to get the type of a Python object.
type(3)
int
type(3.14)
float
The float and int functions can be used to convert from one type to another.
# Convert to float
float(3)
3.0
# Check the type after conversion
type(float(3))
float
# Convert to int
int(3.79)
3
Be cautious with float to int conversion, which truncates the decimal part.
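To make the truncation concrete, here is a small sketch comparing int with the built-in round function and with math.floor and math.ceil from the standard library (none of which are covered in this session). The sample values are arbitrary.

```python
import math

# int() drops the decimal part: it truncates toward zero
print(int(3.79))     # 3
print(int(-3.79))    # -3

# round() goes to the nearest integer instead
print(round(3.79))   # 4
print(round(-3.79))  # -4

# math.floor always rounds down, math.ceil always rounds up
print(math.floor(-3.79))  # -4
print(math.ceil(3.21))    # 4
```

If the intent is rounding rather than cutting off the decimals, round is usually the safer choice.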
Floats can also be written in scientific notation:
2e3
2000.0
type(2e3)
float
# Addition
8 + 9
17
# Subtraction
5 - 2
3
# Multiplication
2 * 6
12
# Division
9 / 4
2.25
# Division by 0
3 / 0
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
Cell In[78], line 2
      1 # Division by 0
----> 2 3 / 0

ZeroDivisionError: division by zero
Division by 0 raises an error, which was of course predictable. But this kind of problem is not uncommon in statistical calculations, especially with NumPy or Pandas, where a similar issue then needs to be debugged.
Code errors are a necessary and actually essential part of learning a language: debugging the errors our code produces helps us learn to avoid them in the future! To do this, it is necessary to understand them well in the first place.
The error above, related to division by 0, produced a Traceback, i.e., a detailed log indicating at which step of the various operations performed by Python the error occurred, as well as the name of the error (ZeroDivisionError) and a description (“division by zero”). In this case, the error is simple, so the message allows us to understand the problem directly. For more complicated operations, the names and messages may be less obvious, but they remain useful for finding the source of the error, for example by pasting them into a search engine.
# Euclidean division: quotient
9 // 4
2
# Euclidean division: remainder
9 % 4
1
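The quotient and the remainder are two halves of the same operation. The short sketch below checks that they recombine into the original number and shows the built-in divmod function, which returns both at once. The variable names are chosen for illustration (variables are introduced later in this session).

```python
a = 9
b = 4

quotient = a // b    # 2
remainder = a % b    # 1

# The two results always recombine into the original number
print(quotient * b + remainder == a)   # True

# divmod returns the quotient and the remainder together
print(divmod(a, b))                    # (2, 1)

# A typical use of the remainder: testing whether a number is even
print(10 % 2 == 0, 7 % 2 == 0)         # True False
```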
# Power
2 ** 5
32
# Square root
5 ** 0.5
2.23606797749979
# Order of operations: usual convention
2 + 5 * (10 - 4)
32
Strings are used to store textual information. More precisely, they can store any Unicode character, which includes letters from different languages, punctuation, numbers, emojis, etc.
A string is defined by enclosing the information in single or double quotes.
# First way
'word'
'word'
# Second way
"this also works"
'this also works'
# But be careful with mixing the two!
'it's a disaster'
Cell In[86], line 2 'it's a disaster' ^ SyntaxError: unterminated string literal (detected at line 2)
Syntax error: the second apostrophe is understood as the end of the string, and Python cannot interpret the rest of the sequence.
Since the beginning of the session, we have seen several errors produced by the code. In practice, it is important to distinguish between two types of errors:
- syntax errors: the code does not comply with Python's syntax rules, as in the example above. The Traceback (error message) indicates, with arrows pointing upwards, the line and the position where the problem started.
- exceptions: the code is syntactically correct but produces an error during execution, as with the division by zero performed earlier.
Why is it important to distinguish between these two types of errors? Because while incorrect syntax necessarily generates a SyntaxError and stops code execution, an exception can be handled in the code. For example, in code where multiple divisions are performed with various parameters, one might want a division by zero to return an arbitrary value (infinity, missing value…) rather than an error that interrupts execution.
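As a minimal sketch of what handling an exception can look like: the try/except syntax and function definitions are not covered in this session, and safe_divide and its fallback parameter are hypothetical names chosen for illustration.

```python
import math


def safe_divide(numerator, denominator, fallback=math.nan):
    """Divide two numbers, returning `fallback` instead of stopping on division by zero."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        # The exception is handled here, so execution continues
        return fallback


print(safe_divide(9, 4))                     # 2.25
print(safe_divide(3, 0))                     # nan
print(safe_divide(3, 0, fallback=math.inf))  # inf
```

Whether a missing value (nan) or infinity is the right replacement depends on the calculation. The point is only that the program keeps running.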
To avoid the syntax error, alternate between the two kinds of quotes when needed:
"it's fine"
"it's fine"
The reverse also works:
'no problem with "quotes"'
'no problem with "quotes"'
print function
Working with strings is an opportunity to discover the very practical and widely used print function. It simply displays the argument passed to it within parentheses, followed by a newline by default.
# Display the string "me"
"me"
'me'
# Display the string "me" with print
print("me")
me
So far, we have seen that we can simply execute a cell to display the contents of a string. But does it work with multiple strings?
# Who will be displayed?
"me"
"not me"
'not me'
Here we see a characteristic behavior of Jupyter notebooks: only the last value returned in a cell is displayed. The print function allows us to overcome this limitation.
# And this time?
print("me")
print("me too")
me
me too
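The print function also accepts several arguments and a few optional parameters. The sketch below uses sep, end and file, which are real parameters of print, and an in-memory buffer from the standard library to make the default newline visible. The variable names are illustrative.

```python
import io

# Several arguments are separated by a space by default
print("me", "me too")

# `sep` changes the separator, `end` changes what is written after the last argument
print("me", "me too", sep=" / ", end=" !\n")

# Writing into an in-memory buffer shows the exact output, newline included
buffer = io.StringIO()
print("me", file=buffer)
captured = buffer.getvalue()
print(repr(captured))  # 'me\n'
```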
The len function counts the number of characters in a string, all characters included (letters, numbers, spaces, punctuation…).
len("I have 19 characters")
20
The “character” type does not exist in Python: a single character is defined as a string of length 1.
print(type("a"))
print(len("a"))
<class 'str'>
1
In Python, a string is a sequence, meaning a series of characters in a specific order. Therefore, each character in a string is indexed (Python knows its position), and we can use this index to extract specific characters, substrings, etc.
In Python, we use brackets [] to access the elements of a sequence by index. More precisely, indexing works as follows: x[a:b:c] returns a substring of the string x where a is the position of the starting character, b is the position of the ending character plus 1, and c is the indexing step. This will be clearer with the following examples.
Important note: indexing starts at 0 in Python.
"a sequence that we will index"
'a sequence that we will index'
# First element
"a sequence that we will index"[0]
'a'
# Second element
"a sequence that we will index"[1]
' '
# Last element
"a sequence that we will index"[-1]
'x'
# Extract everything from a certain character
"a sequence that we will index"[4:]
'quence that we will index'
# Extract everything up to a certain character
"a sequence that we will index"[:12]
'a sequence t'
# Extract a substring
"a sequence that we will index"[4:12]
'quence t'
# Extract every second character, starting from index 4
"a sequence that we will index"[4::2]
'qec htw ilidx'
# Reverse a sequence
"a sequence that we will index"[::-1]
'xedni lliw ew taht ecneuqes a'
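The same rules can be checked on a shorter string, where positions are easier to count by hand. This is a small sketch with an arbitrary example string.

```python
text = "python"  # positions: p=0, y=1, t=2, h=3, o=4, n=5

# x[a:b] keeps positions a to b-1: position b itself is excluded
print(text[1:4])    # yth

# Omitted bounds default to the start and the end of the string
print(text[:2] + text[2:] == text)   # True

# Negative indices count from the end: -1 is the last character
print(text[-1] == text[len(text) - 1])   # True
print(text[-3:])    # hon

# The step selects every c-th character; a negative step walks backwards
print(text[::2])    # pto
print(text[::-1])   # nohtyp
```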
To remember: it is because a string is considered a sequence by Python that we can index it. For example, indexing a number does not make sense and therefore returns an error.
2[3]
<>:1: SyntaxWarning: 'int' object is not subscriptable; perhaps you missed a comma?
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[104], line 1
----> 1 2[3]

TypeError: 'int' object is not subscriptable
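If the goal is to access the digits of a number, one possible approach is to convert it to a string first, since strings are sequences. The value below is arbitrary.

```python
number = 2024  # arbitrary example value

# Converting to a string makes the digits indexable
digits = str(number)
first_digit = digits[0]
last_two = digits[-2:]

print(first_digit, last_two)  # 2 24

# The extracted pieces are strings: convert back to int for arithmetic
print(int(last_two) + 1)      # 25
```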
# String concatenation
"my address is: " + "10 Maple Street"
'my address is: 10 Maple Street'
# Repetition
"echo - " * 5
'echo - echo - echo - echo - echo - '
Different Python objects generally have built-in methods that allow performing basic operations on the object.
We will see in a future chapter what methods in Python precisely consist of. For now, you can remember that methods are used in the format object.method(parameters) where the parameters are optional.
# Convert to uppercase
"sequence 850".upper()
'SEQUENCE 850'
# Convert to lowercase
"sequence 850".lower()
'sequence 850'
# Split words by spaces
"a sequence to split".split()
['a', 'sequence', 'to', 'split']
# Split words by an arbitrary character
"useful for making sub-sequences".split("-")
['useful for making sub', 'sequences']
# Use strings as templates
"my address is: {}".format("10 Maple Street")
'my address is: 10 Maple Street'
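Two complements to the examples above, as a short sketch: join, a string method not shown in this tutorial, reverses the effect of split, and method calls can be chained because each one returns a new object.

```python
sentence = "a sequence to split"  # string taken from the example above

# split returns a list of words...
words = sentence.split()

# ...and join glues them back together with the chosen separator
rebuilt = " ".join(words)
dashed = "-".join(words)

# Methods can be chained: each call applies to the result of the previous one
shouted = sentence.upper().split()

print(rebuilt == sentence)  # True
print(dashed)               # a-sequence-to-split
print(shouted)              # ['A', 'SEQUENCE', 'TO', 'SPLIT']
```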
This is just a glimpse of the countless operations possible on strings. The official documentation lists all the available built-in methods. The chapter exercises and the mini-projects at the end of the section will be an opportunity to discover other uses.
So far, we have had to define our object each time before we could apply a transformation to it. What if we want to reuse an object and apply multiple transformations to it? Or perform operations with different objects?
To do this, we will assign objects to variables.
Assignment is done in the format: variable_name = object. This allows us to perform operations with these variables.
x = 5
x
5
type(x)
int
x + 5
10
y = x + 2*x
y
15
Unlike some other programming languages, Python is said to be dynamically typed: it is possible to reassign a variable to an object of a different type. This makes code easier to read and write, but it can sometimes create problems that are difficult to debug. Therefore, always make sure that the type of the variable is the one you think you are handling.
x = 3
x = "blabla"
type(x)
str
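One way to check a type before using a variable is sketched below. isinstance is a built-in function. The if statement is not covered in this session, and the variable value is a hypothetical input used for illustration.

```python
x = 3
print(type(x) is int)      # True

x = "blabla"               # the same name now refers to a string
print(type(x) is int)      # False
print(isinstance(x, str))  # True

# A quick guard before doing arithmetic on a value of uncertain type
value = "42"               # hypothetical input that arrived as text
if isinstance(value, str):
    value = int(value)
print(value + 1)           # 43
```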
Naturally, there are certain constraints on operations depending on the types of objects.
x = "test"
y = 3
x + y
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[117], line 3
      1 x = "test"
      2 y = 3
----> 3 x + y

TypeError: can only concatenate str (not "int") to str
However, it is possible to harmonize the types beforehand:
x = "test"
y = 3
z = str(y)
x + z
'test3'
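The conversion can go in either direction, depending on whether text or a number is wanted. A small sketch, where the variable n is an illustrative addition:

```python
x = "test"
y = 3

# Text concatenation: convert the number to a string
print(x + str(y))      # test3

# Numeric addition: convert a numeric string to a number
n = "3"
print(int(n) + y)      # 6
print(float(n) + y)    # 6.0

# Repetition, seen earlier, does accept a string and an integer together
print(x * y)           # testtesttest
```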
It is common to use a variable as a counter, incrementing it each time a given event occurs, for example.
a = 0
print(a)
a = a + 1
print(a)
0
1
This practice is so common that there are special operators for common arithmetic operations.
a = 0
a += 1
a
1
b = 5
b *= 3
b
15
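The same shorthand exists for the other arithmetic operators, and += and *= also work on strings. A brief sketch with arbitrary values:

```python
total = 10
total -= 4     # same as total = total - 4   -> 6
total *= 3     # -> 18
total //= 4    # integer division            -> 4
total **= 2    # power                       -> 16
total /= 5     # true division gives a float -> 3.2
print(total)   # 3.2

# += and *= on strings perform concatenation and repetition
message = "echo"
message += " - "
message *= 2
print(repr(message))  # 'echo - echo - '
```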
print function?
len function?
1/ The basic types in Python seen in this tutorial are numerical types, including integers (int) and floats (decimal numbers, float), and strings (str), which store textual information.
2/ To convert an integer to a float, we use the float() function. To convert a float to an integer, we use the int() function.
3/ A string is defined by enclosing the information in single (') or double (") quotes.
4/ A Traceback is an error report that shows the sequence of operations that led to an exception. It helps identify the origin of the error in the code.
5/ A syntax error occurs when Python code does not comply with the language’s syntax rules, making the script unexecutable. An exception is an error detected during execution, even if the code has correct syntax. Exceptions can be handled in the code, whereas syntax errors necessarily lead to code execution stopping.
6/ The print function displays the content of the argument passed to it within parentheses on the Python console or in a Jupyter notebook cell.
7/ By default, executing a Jupyter cell displays the last value returned by the code executed in that cell. If two lines of code return something, only the last one will be displayed. To display multiple elements in the same cell, we use the print function for each operation whose result we want to display.
8/ The len function returns the number of elements in an object, for example the number of characters in a string. This function only makes sense for objects that contain elements, such as sequences.
9/ A built-in method is a function integrated into a Python object type that allows performing specific operations on that object. It is recognized because it is called directly on the object with the syntax object.method().
10/ Variables are used to store values or objects to reuse and manipulate them more easily in the code. They also give a name to the data to make it more readable and easier to manipulate.
11/ Python is said to be dynamically typed because the type of variables is determined at runtime and can change during execution. Thus, a variable initially defined as a string can become a numeric variable during code execution, which is impossible in programming languages based on static typing.
Display the type of x when:
- x = 3
- x = "test"
- x = 3.5
# Type your answer in this cell
x = 3
print(type(x))
x = "test"
print(type(x))
x = 3.5
print(type(x))
<class 'int'>
<class 'str'>
<class 'float'>
Calculate the sum of the lengths of the following three strings:
- "a first string"
- "and a second"
- "never two without three"
# Type your answer in this cell
a = "a first string"
b = "and a second"
c = "never two without three"
len(a) + len(b) + len(c)
49
What is the appropriate type to define a postal code?
Try defining the following postal codes in both int and string format:
- 92120
- 02350
What do you conclude?
# Type your answer in this cell
cp1_int = 92120
cp1_str = "92120"

print(cp1_int, cp1_str) # No problem

cp2_int = 02350
Cell In[127], line 6 cp2_int = 02350 ^ SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers
Error: Python does not accept defining an integer that starts with a 0. Postal codes must therefore be defined as strings.
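A short sketch of what happens to the leading zero in practice. zfill is a built-in string method not covered in this tutorial, and the variable names are illustrative.

```python
# Stored as text, the leading zero is preserved
postal_code = "02350"
print(len(postal_code))        # 5

# Converting to int silently drops it
as_number = int(postal_code)
print(as_number)               # 2350

# zfill pads a string with zeros on the left up to the requested width
restored = str(as_number).zfill(5)
print(restored)                # 02350
```

This kind of padding is only a repair. Keeping postal codes as strings from the start avoids the problem.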
Count the number of times the letter e appears in the following string: “I am counting the e’s.”
Hint: you can use the built-in method count.
# Type your answer in this cell
a = "I am counting the e's."
a.count('e')
2
Find the first position where the letter e appears in the following string: “I am counting the e’s.”
Hint: you can use the built-in method find.
# Type your answer in this cell
a = "I am counting the e's."
a.find('e')
16
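Two details of count and find are worth knowing, sketched here on the same string: both are case-sensitive, and find returns -1 rather than raising an error when nothing is found.

```python
a = "I am counting the e's."

print(a.count("e"))    # 2
print(a.count("E"))    # 0: the search is case-sensitive
print(a.find("e"))     # 16: position of the first match
print(a.find("z"))     # -1: returned when nothing is found

# Checking the result avoids treating -1 as a real position
position = a.find("z")
found = position != -1
print(found)           # False
```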
Remove the extra spaces at the beginning and end of the following string:
Hint: you can use the built-in method strip.
# Type your answer in this cell
a = " A poorly formatted string. "
a = " A poorly formatted string. "
a.strip()
'A poorly formatted string.'
The \ character allows escaping (neutralizing) a special character within a string. Find how this character solves the problem of using quotes (or apostrophes) within a string defined by quotes (apostrophes).
Hint: examples of usage are available in the official documentation.
# Type your answer in this cell
"just a \"small\" test"
'just a "small" test'
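The backslash also solves the apostrophe problem seen earlier, and it introduces a few other standard escape sequences. A small sketch with arbitrary example strings:

```python
# The earlier apostrophe problem, solved with a backslash
fixed = 'it\'s fine'
print(fixed)                    # it's fine

# Other common escape sequences: \n (newline), \t (tab), \\ (a literal backslash)
print("line 1\nline 2")
print("name\tvalue")
print("C:\\temp")               # C:\temp

# An escape sequence counts as a single character
print(len("a\nb"))              # 3
```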
Perform the following sequence of operations using increment operators, and print the final value:
- initialize a variable to 1
- subtract 5 from it
- multiply it by 4
- add 22 to it
# Type your answer in this cell
a = 1
a -= 5
a *= 4
a += 22
print(a)
6
Consider the following two sequences:
- “we are in”
- “2024”
Find two different ways from the tutorial to compose the sequence “we are in 2024”.
Hint: one of the two methods involves modifying (slightly) one of the two sequences.
# Type your answer in this cell
a1 = "we are in"
a2 = "we are in {}"
b = "2024"
print(a1 + " " + b)
print(a2.format(b))
we are in 2024
we are in 2024
f-strings
f-strings are a somewhat special but very practical form of strings, added in Python version 3.6. To understand their value, let’s go back to the solution of the previous exercise, which illustrated two ways to compose the string “we are in 2024”.
a1 = "we are in"
a2 = "we are in {}"
b = "2024"
print(a1 + " " + b)
print(a2.format(b))
we are in 2024
we are in 2024
These two methods work but have limitations. Think about what would happen in the following cases, and feel free to run tests to convince yourself:
format() method (e.g., if “2024” were an integer and not a string)?
Using the official documentation as inspiration, use f-strings to solve these various problems.
# Type your answer in this cell
First problem: composing strings with numeric values.
a1 = "we are in"
b = 2024
# print(a1 + " " + b) # Error
print(a1 + " " + str(b))
we are in 2024
Direct concatenation raises an error: you must convert the numeric value to a string first.
Second problem: juxtaposition of multiple strings.
a = "we are in"
b = "2024"
c = "and my name is"
d = "Miranda"
print(a + " " + b + " " + c + " " + d)
we are in 2024 and my name is Miranda
The syntax quickly becomes unreadable, as you must manually add separators (space) between each part.
Third problem: composing strings with variable injection.
a = "we are in {}"
b = "2024"
c = "and my name is {}"
d = "Miranda"
print(a.format(b) + " " + c.format(d))
we are in 2024 and my name is Miranda
The syntax remains unreadable, as you must inject values into each string.
Solution: using f-strings.
year = 2024
name = "Miranda"
print(f"we are in {year} and my name is {name}")
we are in 2024 and my name is Miranda
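f-strings can do a little more than inject variables. The braces accept any expression, and an optional format specification after a colon controls how the value is displayed. The sketch below reuses year and name from the solution. The price variable is a hypothetical value added for illustration.

```python
year = 2024
name = "Miranda"
price = 1234.5678  # hypothetical value for illustration

# Any expression can go inside the braces, with no manual conversion
next_year = f"next year is {year + 1}"
letters = f"{name.upper()} has {len(name)} letters"
print(next_year)   # next year is 2025
print(letters)     # MIRANDA has 7 letters

# A format specification after ":" controls how the value is displayed
two_decimals = f"{price:.2f}"
with_separator = f"{price:,.1f}"
print(two_decimals)     # 1234.57
print(with_separator)   # 1,234.6
```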