Data Structures#
What you will learn in this lesson:
Lists
Dictionaries
Tuples
Sets
Ranges
In contrast to primitive data types (e.g. integers, floats and booleans), data structures organize values into collections that have certain properties, such as order, mutability, and an addressing scheme (e.g. by index or by key).
Lists#
A list is an ordered sequence of items.
Each element of a list is associated with an integer index that represents its position in the list. Indexing starts at 0.
Lists are indexed with brackets [].
List elements are accessed by providing their index in the brackets.
Lists are mutable, meaning you can modify them after they have been created.
They can contain mixed types.
Constructing a list#
Lists can be constructed in several ways:
list1 = [] # empty list
list2 = list(()) # Also an empty list
list3 = "some string".split()
numbers = [1,2,3,4] # a list of integers
print(list1)
print(list2)
print(list3)
print(numbers)
[]
[]
['some', 'string']
[1, 2, 3, 4]
# Lists can contain mixed types
myList = ['coconuts', 777, 7.25, 'Sir Robin', 80.0, True]
myList
['coconuts', 777, 7.25, 'Sir Robin', 80.0, True]
Practice exercise#
Using some of the previous methods, construct a list containing your numerical birth month, the first letter of your name, and a boolean answering the statement: I like coffee.
# Start your answers here
Indexing#
numbers = [1,2,3,4,5,6]
numbers[0] # Access first element (output: 1)
1
numbers[0] + numbers[3] # doing arithmetic with the elements of the list (output: 5)
5
# Like with strings, we can get the number of elements of the list with the function len()
len(numbers)
6
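numbers[:2] # returns the elements from index 0 up to index 1 (stop index 2 minus 1)
[1, 2]
numbers[-2:] # returns the last two elements: 5 (index 6 minus 2 = 4) and 6 (index 6 minus 1 = 5)
[5, 6]
# Returns a list holding only the last element (see strings lesson); numbers[-1] gives the element itself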
numbers[-1:]
[6]
You can find the index of an element by using the method index:
numbers.index(3)
2
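As a rough illustration of the indexing rules above, the sketch below uses a made-up list called demo (it is not part of the lesson). It contrasts a negative index with a negative slice, and shows that index raises a ValueError when the value is not in the list.

```python
# Hypothetical list used only for this illustration
demo = [10, 20, 30, 40]

first = demo[0]            # first element
last = demo[-1]            # last element (a single item, not a list)
last_as_list = demo[-1:]   # a slice: a list holding only the last element

position = demo.index(30)  # index of the first occurrence of 30

# index() raises ValueError when the value is not in the list
try:
    demo.index(99)
    found = True
except ValueError:
    found = False

print(first, last, last_as_list, position, found)  # 10 40 [40] 2 False
```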
Practice exercise#
Apply indexing to the list in the first practice exercise to pull out your numerical birth month.
# Start your answers here
Slicing#
numbers[0:2] # Output: [1, 2]
[1, 2]
numbers[1:3] # Output: [2, 3]
[2, 3]
numbers[2:] # Output: [3, 4, 5, 6]
[3, 4, 5, 6]
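Slices follow the general pattern start:stop:step, with stop excluded. The following sketch, on an illustrative list named values, adds the optional step and also shows that a slice is a new list, so changing it leaves the original untouched.

```python
values = [1, 2, 3, 4, 5, 6]   # illustrative list, not from the lesson

every_other = values[::2]     # a step of 2 takes every second element
reversed_copy = values[::-1]  # a negative step walks backwards
middle = values[1:4]          # indices 1, 2 and 3

# A slice is a new list: modifying it does not affect `values`
middle[0] = 99

print(every_other)    # [1, 3, 5]
print(reversed_copy)  # [6, 5, 4, 3, 2, 1]
print(middle)         # [99, 3, 4]
print(values)         # [1, 2, 3, 4, 5, 6]
```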
Practice exercise#
Slice the list in the first practice exercise using one of the approaches above. For more, see https://www.learnbyexample.org/python-list-slicing/
# Start your answers here
Operations on lists#
The operators * and + work on lists much as they do on strings.
# This yields list repeated 2 times
numbers * 2
[1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]
Practice exercise#
Multiply the list in the first practice exercise by a given scalar.
# This concatenates two lists
numbers2 = [30, 40, 50]
numbers + numbers2
[1, 2, 3, 4, 5, 6, 30, 40, 50]
Practice exercise#
Add “numbers” to the list in the first practice exercise.
# Start your answers here
Some methods#
The following are methods that I find particularly useful when using lists:
append(elmnt): Appends an element to the end of the list. This append operation is performed in place.
print(numbers)
# Let's add the element 10 at the end of the list
numbers.append(10)
print(numbers)
[1, 2, 3, 4, 5, 6]
[1, 2, 3, 4, 5, 6, 10]
We could achieve the same result using list concatenation. However, concatenation creates a new list rather than modifying the original one, so we would need to reassign the result back to the original variable in order to update it.
# This was the original 'numbers' list
numbers = [1,2,3,4,5,6]
# This is using concatenation to add the value of 10
print(numbers + [10])
# However it did not modify the original list
print(numbers)
# We need to reassign the variable to the new list
numbers = numbers + [10]
print(numbers)
[1, 2, 3, 4, 5, 6, 10]
[1, 2, 3, 4, 5, 6]
[1, 2, 3, 4, 5, 6, 10]
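One way to see the difference between the two approaches is to compare the identity of the list before and after each operation with the built-in id() function. This is only a sketch; the name basket is invented for the example.

```python
basket = [1, 2, 3]            # hypothetical list for this example
id_start = id(basket)

basket.append(4)              # modifies the existing list in place
id_after_append = id(basket)

basket = basket + [5]         # builds a new list, then rebinds the name
id_after_concat = id(basket)

print(id_start == id_after_append)  # True: still the same object
print(id_start == id_after_concat)  # False: a different object
print(basket)                       # [1, 2, 3, 4, 5]
```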
insert(pos, elmnt): Inserts the specified value at the specified position. This method takes two arguments: the first is the position in the list where you want to insert the new value, and the second is the value that you want to insert.
This operation is performed in place.
numbers = [1,2,3,4,5,6]
print(numbers)
numbers.insert(1, 10)
print(numbers)
[1, 2, 3, 4, 5, 6]
[1, 10, 2, 3, 4, 5, 6]
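The position argument also accepts values at or beyond the ends of the list. The sketch below, using a made-up list called queue, shows three cases: the front of the list, a position past the end, and a negative position.

```python
queue = ['b', 'c']      # illustrative list

queue.insert(0, 'a')    # position 0 puts the value at the front
queue.insert(100, 'z')  # a position past the end behaves like append
queue.insert(-1, 'y')   # -1 inserts just before the current last element

print(queue)  # ['a', 'b', 'c', 'y', 'z']
```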
Notes#
# This is not allowed
myList * myList
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[23], line 2
1 # This is not allowed
----> 2 myList * myList
TypeError: can't multiply sequence by non-int of type 'list'
names = ['Darrell', 'Clayton', ['Billie', 'Arthur'], 'Samantha']
print(names[2]) # returns a *list*
print(names[0]) # returns a *string*
['Billie', 'Arthur']
Darrell
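Because indexing returns whatever object is stored at a position, indices can be chained to reach inside a nested list. A small sketch with invented names:

```python
team = ['Ana', ['Ben', 'Cy'], 'Di']  # made-up data with one nested list

inner = team[1]       # the nested list itself
second = team[1][0]   # first item of the nested list
initial = team[0][0]  # strings can be indexed too: first letter of 'Ana'

print(inner, second, initial, len(team))  # ['Ben', 'Cy'] Ben A 3
```

Note that len counts the nested list as a single element.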
Mutability#
Lists are mutable!
print(f"(Before) First name in names: {names[0]}")
names[0] = "Clint"
print(f"(After) First name in names: {names[0]}")
(Before) First name in names: Darrell
(After) First name in names: Clint
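A practical consequence of mutability is that two names can refer to the same list, so a change made through one name is visible through the other. The sketch below (with hypothetical names) contrasts this with the list method copy, which makes an independent shallow copy.

```python
scores = [70, 80, 90]     # illustrative list
alias = scores            # a second name for the same list object
snapshot = scores.copy()  # an independent (shallow) copy

scores[0] = 100           # modify the list in place

print(alias)     # [100, 80, 90]: the alias sees the change
print(snapshot)  # [70, 80, 90]: the copy does not
```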
Dictionaries#
Dictionaries are like hash tables, containing key-value pairs.
Elements are indexed using brackets [] (like lists), but dictionaries are constructed using braces {} or dict().
Key names must be unique. If you re-use a key, you overwrite its value.
Keys don’t have to be strings: they can be any immutable (hashable) value, such as numbers or tuples, or expressions that evaluate to one of these.
Constructing a dictionary#
dictionary1 = {
'a': 1,
'b': 2,
'c': 3
}
dictionary2 = dict(x=55, y=29, z=99) # Note the absence of quotes around keys
dictionary2
{'x': 55, 'y': 29, 'z': 99}
dictionary3 = {'A': 'foo',
99: 'bar', # Note that the key is now a number
(1,2): 'baz' # Note that the key is now a tuple (see below)
}
dictionary3
{'A': 'foo', 99: 'bar', (1, 2): 'baz'}
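The two rules about keys (they must be unique, and they must be immutable values) can be checked with a short sketch. The dictionary names here are invented for illustration.

```python
# Re-using a key in a literal keeps only the last value assigned to it
inventory = {'apple': 1, 'pear': 2, 'apple': 5}

# A mutable object such as a list cannot be used as a key
try:
    bad = {[1, 2]: 'value'}
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

# The equivalent tuple works because tuples are immutable
good = {(1, 2): 'value'}

print(inventory)         # {'apple': 5, 'pear': 2}
print(list_key_allowed)  # False
print(good[(1, 2)])      # value
```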
Retrieve a value#
Just pass the key as the index in the brackets.
phonelist = {'Tom':123, 'Bob':456, 'Sam':897}
phonelist['Bob']
456
Or use the method get:
phonelist.get('Bob')
456
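The two approaches differ when the key is missing: brackets raise a KeyError, while get returns None or a default value of your choosing. A minimal sketch with a made-up dictionary:

```python
contacts = {'Tom': 123, 'Bob': 456}    # illustrative data

missing = contacts.get('Zoe')          # no such key: returns None
with_default = contacts.get('Zoe', 0)  # returns the supplied default instead

# Brackets raise KeyError for a missing key
try:
    contacts['Zoe']
    raised = False
except KeyError:
    raised = True

print(missing, with_default, raised)  # None 0 True
```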
Adding a new entry#
We can always add new entries to a dictionary by assigning a value to a new key.
# We create a new key-value mapping
phonelist["John"] = 332
phonelist
{'Tom': 123, 'Bob': 456, 'Sam': 897, 'John': 332}
Some methods#
keys: Provides the keys of a dictionary. Keys are not sorted; they appear in the order entered.
phonelist.keys() # Returns a view of the keys
dict_keys(['Tom', 'Bob', 'Sam', 'John'])
values: Provides the values of a dictionary.
phonelist.values() # Returns a view of the values
dict_values([123, 456, 897, 332])
items: Provides both the keys and values of a dictionary.
phonelist.items() # Returns a view of (key, value) tuples
dict_items([('Tom', 123), ('Bob', 456), ('Sam', 897), ('John', 332)])
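The objects returned by keys, values and items are view objects rather than plain lists: they reflect later changes to the dictionary and cannot be indexed directly. The sketch below uses an invented dictionary to show this, and converts a view with list() when a real list is needed.

```python
ages = {'Ann': 31, 'Raj': 45}  # illustrative data

keys_view = ages.keys()        # a view, taken before the next change
ages['Lee'] = 28               # add an entry afterwards

# The view reflects the new key without calling keys() again
key_list = list(keys_view)     # convert to a real list when indexing is needed

print(key_list)     # ['Ann', 'Raj', 'Lee']
print(key_list[0])  # Ann
```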
These methods are handy when using for loops (next week).
for key in sorted(phonelist.keys()):
    print(key)
Bob
John
Sam
Tom
update: Adds the key-value pairs from another dictionary (or an iterable of key-value pairs) to the dictionary, overwriting the values of keys that already exist.
phonelist.update({"Sarah": 223})
phonelist
{'Tom': 123, 'Bob': 456, 'Sam': 897, 'John': 332, 'Sarah': 223}
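As a sketch of how update treats existing and new keys, the example below (with made-up data) passes several pairs at once and also uses the keyword-argument form.

```python
stock = {'pens': 10, 'pads': 4}          # illustrative data

stock.update({'pens': 12, 'clips': 50})  # overwrites 'pens', adds 'clips'
stock.update(tape=3)                     # keyword form adds the key 'tape'

print(stock)  # {'pens': 12, 'pads': 4, 'clips': 50, 'tape': 3}
```

An overwritten key keeps its original position; only new keys are added at the end.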
len and in with dictionaries.
len(phonelist)
5
# If we use `in` with the dictionary itself, it checks the keys
"Bob" in phonelist
True
# Therefore, if we check for a particular value in this way, it will return False
123 in phonelist
False
# We have to specify that we want to check the values of the dictionary
123 in phonelist.values()
True
Tuples#
A tuple is like a list but with one big difference: a tuple is an immutable object! That is, you can’t change a tuple once it’s created.
A tuple can contain any number of elements of any datatype.
Tuples are accessed with brackets [] but constructed with or without parentheses ().
# We saw lists are mutable
numbers = [1, 2, 3, 4, 5, 6]
print(numbers)
numbers[3] = 30
print(numbers)
[1, 2, 3, 4, 5, 6]
[1, 2, 3, 30, 5, 6]
# But, tuples are immutable
numbers4 = (26, 27, 28)
numbers4[2] = 30
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[40], line 3
1 # But, tuples are immutable
2 numbers4 = (26, 27, 28)
----> 3 numbers4[2] = 30
TypeError: 'tuple' object does not support item assignment
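Since a tuple cannot be changed in place, the usual workaround is to build a new tuple from pieces of the old one. A minimal sketch, with hypothetical names:

```python
point = (26, 27, 28)         # illustrative tuple

# Build a new tuple: the first two items plus a replacement third item
updated = point[:2] + (30,)

# Tuples can also be unpacked into separate variables
x, y, z = updated

print(point)    # (26, 27, 28): the original is unchanged
print(updated)  # (26, 27, 30)
print(z)        # 30
```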
Constructing#
Tuples are created with comma-separated values, with or without parentheses.
letters = 'a', 'b', 'c', 'd'
letters
('a', 'b', 'c', 'd')
type(letters)
tuple
numbers = (1,2,3,4) # numbers 1,2,3,4 stored in a tuple
numbers
(1, 2, 3, 4)
type(numbers)
tuple
A single-valued tuple must include a trailing comma, e.g.
# If we miss the comma with a single valued tuple, it is not a tuple
tuple0 = (29)
type(tuple0)
int
# comma included
tuple1 = (29,)
type(tuple1)
tuple
numbers[2]
3
numbers[:3]
(1, 2, 3)
len and in with tuples.
len(numbers)
4
10 in numbers
False
Sets#
A set is an unordered collection of unique objects.
Sets support set operations such as union and intersection.
peanuts = {'snoopy','snoopy','woodstock'}
# A set only keeps unique objects
peanuts
{'snoopy', 'woodstock'}
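A common use of this property is removing duplicates from a list by passing it to set(). The sketch below (with invented data) also shows that an empty set must be written set(), because empty braces create a dictionary.

```python
votes = ['yes', 'no', 'yes', 'yes']  # illustrative list with duplicates

unique_votes = set(votes)            # duplicates are dropped

empty_set = set()                    # the way to write an empty set
empty_braces = {}                    # this is an empty dictionary, not a set

print(len(unique_votes))             # 2
print(type(empty_set).__name__)      # set
print(type(empty_braces).__name__)   # dict
```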
Since sets are unordered, they don’t have an index. Trying to index a set raises a TypeError:
peanuts[0]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[53], line 1
----> 1 peanuts[0]
TypeError: 'set' object is not subscriptable
Although sets have no index, you can still iterate over them with a for loop:
for peanut in peanuts:
    print(peanut)
snoopy
woodstock
len and in with sets.
len(peanuts)
2
'snoopy' in peanuts
True
Some useful methods#
union: Combines two sets.
set1 = {'python','R'}
set2 = {'R','SQL'}
set1.union(set2)
{'R', 'SQL', 'python'}
Unlike lists, sets do not support the + operator; use union (or the | operator) instead:
set1 + set2
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[59], line 1
----> 1 set1 + set2
TypeError: unsupported operand type(s) for +: 'set' and 'set'
intersection: Returns the elements common to both sets.
set1.intersection(set2)
{'R'}
For this operation you can also use the & operator:
set1 & set2
{'R'}
difference: Returns the elements of the left-hand set that are not in the right-hand one.
set1.difference(set2)
{'python'}
set2.difference(set1)
{'SQL'}
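Just as & is shorthand for intersection, Python also provides | for union and - for difference. The sketch below uses its own example sets and sorts the results before printing, because the display order of a set is not guaranteed.

```python
langs_a = {'python', 'R'}     # illustrative sets
langs_b = {'R', 'SQL'}

combined = langs_a | langs_b  # same result as langs_a.union(langs_b)
only_a = langs_a - langs_b    # same result as langs_a.difference(langs_b)

print(sorted(combined))       # ['R', 'SQL', 'python']
print(sorted(only_a))         # ['python']
print(sorted(langs_a))        # ['R', 'python']: the original set is unchanged
```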
Ranges#
A range is a sequence of integers, from start to stop by step.
The start point is zero by default.
The stop point is NOT included. Like when slicing strings and lists, the last value of the sequence corresponds to stop-1 (for a step of one).
The step is one by default.
Ranges can be assigned to a variable.
rng = range(5)
rng
range(0, 5)
More often, ranges are used in iterations (e.g. for loops), which we will cover later.
for rn in rng:
    print(rn)
0
1
2
3
4
Another range, this time with an explicit start, stop and step:
rangy = range(1, 11, 2)
for rn in rangy:
    print(rn)
1
3
5
7
9
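A range does not display its values until you iterate over it or convert it, for example with list(). The sketch below uses invented names to show a negative step, along with len, indexing and in.

```python
countdown = range(10, 0, -3)       # start 10, stop 0 (excluded), step -3

as_list = list(countdown)          # materialize the values
size = len(countdown)              # number of values in the range
second = countdown[1]              # ranges support indexing
has_five = 5 in range(0, 10, 2)    # 0, 2, 4, 6, 8 does not contain 5

print(as_list)   # [10, 7, 4, 1]
print(size)      # 4
print(second)    # 7
print(has_five)  # False
```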
Summary#
Throughout this lesson, we have mentioned what operations, functions and features each data structure is characterized by. Here is a compact summary:
Data Structure | Supports Indexing? | Supports Slicing? | len() | in | Concatenation (+) | Multiplication (*) | Mutable? |
---|---|---|---|---|---|---|---|
Strings | Yes | Yes | Yes | Yes | Yes | Yes | No |
Lists | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
Tuples | Yes | Yes | Yes | Yes | Yes | Yes | No |
Sets | No | No | Yes | Yes | No | No | Yes |
Dictionaries | No | No | Yes | Yes | No | No | Yes |
Ranges | Yes | Yes | Yes | Yes | No | No | No |
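The concatenation and multiplication columns can be checked empirically. The sketch below is one possible way to do so, not part of the lesson: the helper supports and the sample objects are invented for illustration, and it simply tries each operation and records whether Python raises a TypeError.

```python
# One small sample object per data structure (illustrative values)
samples = {
    'Strings': 'abc',
    'Lists': [1, 2, 3],
    'Tuples': (1, 2, 3),
    'Sets': {1, 2, 3},
    'Dictionaries': {'a': 1},
    'Ranges': range(3),
}

def supports(operation, obj):
    """Return True if operation(obj) runs without raising TypeError."""
    try:
        operation(obj)
        return True
    except TypeError:
        return False

# Try obj + obj and obj * 2 for every sample
concat = {name: supports(lambda o: o + o, obj) for name, obj in samples.items()}
repeat = {name: supports(lambda o: o * 2, obj) for name, obj in samples.items()}

for name in samples:
    print(name, concat[name], repeat[name])
```

Only strings, lists and tuples should report True for both operations.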
Practice exercise#
List one important attribute each of lists, dictionaries, and tuples.
# Start your answers here