Data Structures#

What you will learn in this lesson:

  • Lists

  • Dictionaries

  • Tuples

  • Sets

  • Ranges

In contrast to primitive data types (e.g. integers, floats and booleans), data structures group values into collections that have certain properties, such as order, mutability, and an addressing scheme (e.g. access by index or by key).

Lists#

A list is an ordered sequence of items.

Each element of a list is associated with an integer index that gives its position in the list, starting at 0.

Lists are indexed with brackets [].

List elements are accessed by providing their index in the brackets.

Lists are mutable, meaning you can modify them after they have been created.

They can contain mixed types.

Constructing a list#

Lists can be constructed in several ways:

list1 = [] # empty list
list2 = list(()) # Also an empty list
list3 = "some string".split()
numbers = [1,2,3,4] # a list of integers

print(list1)
print(list2)
print(list3)
print(numbers)
[]
[]
['some', 'string']
[1, 2, 3, 4]
# Lists can contain mixed types
myList = ['coconuts', 777, 7.25, 'Sir Robin', 80.0, True]
myList
['coconuts', 777, 7.25, 'Sir Robin', 80.0, True]
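
As a small additional sketch, list() can also build a list from any other iterable, such as a string, a tuple or a range (ranges are covered at the end of this lesson). The variable names below are made up for illustration.

```python
# list() accepts any iterable and copies its items into a new list
letters_list = list("abc")      # one item per character
from_tuple = list((1, 2, 3))    # tuple -> list
from_range = list(range(3))     # range -> list

print(letters_list)
print(from_tuple)
print(from_range)
```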

Practice exercise#

Exercise 10

Using some of the previous methods, construct a list containing your numerical birth month, the first letter of your name, and a boolean answering the statement: I like coffee.

# Start your answers here

Indexing#

Info: Indexing a list is similar to indexing a string.
numbers = [1,2,3,4,5,6]
numbers[0] # Access first element (output: 1)
1
numbers[0] + numbers[3] # doing arithmetic with the elements of the list (output: 5)
5
# Like with strings, we can get the number of elements of the list with the function len()
len(numbers)
6
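numbers[:2] # returns the first two elements: 1 (index 0) and 2 (index 1)
[1, 2]
numbers[-2:] # returns the last two elements: 5 (6 minus 2 = index 4) and 6 (6 minus 1 = index 5)
[5, 6]
# Returns a list containing only the last element; numbers[-1] returns the element itself (see strings lesson)
numbers[-1:]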
[6]

You can find the index of the first occurrence of an element by using the method index.

numbers.index(3)
2
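
One detail worth knowing: index raises a ValueError when the value is not in the list. The sketch below shows one possible way to guard against that by checking membership with in first. Using None to mean "not found" is just a convention chosen for this example.

```python
numbers = [1, 2, 3, 4, 5, 6]

# index() raises a ValueError when the value is absent,
# so we check membership with `in` before calling it
value = 10
if value in numbers:
    position = numbers.index(value)
else:
    position = None  # our own convention for "not found"

print(position)
```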

Practice exercise#

Exercise 11

Apply indexing to the list in the first practice exercise to pull out your numerical birth month.

# Start your answers here

Slicing#

Info: Slicing a list is similar to slicing a string.
numbers[0:2] # Output: [1, 2]
[1, 2]
numbers[1:3] # Output: [2, 3]
[2, 3]
numbers[2:]  # Output: [3, 4, 5, 6]
[3, 4, 5, 6]
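
Slices also accept an optional third value, a step, written as start:stop:step. The short sketch below is an illustrative addition (the variable names are made up); it also shows that slicing produces a new list and leaves the original untouched.

```python
numbers = [1, 2, 3, 4, 5, 6]

every_other = numbers[::2]     # from start to end, taking every second element
reversed_copy = numbers[::-1]  # a negative step walks the list backwards

print(every_other)
print(reversed_copy)
print(numbers)  # slicing returns a new list; the original is unchanged
```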

Practice exercise#

Exercise 12

Slice the list in the first practice exercise using a method from above. For more see https://www.learnbyexample.org/python-list-slicing/

# Start your answers here

Operations on lists#

Info: The operators * and + work the same way as with strings.
# This yields list repeated 2 times
numbers * 2
[1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]

Practice exercise#

Exercise 13

Multiply the list in the first practice exercise by an integer of your choice.

# Start your answers here

# This concatenates two lists
numbers2 = [30, 40, 50]
numbers + numbers2 
[1, 2, 3, 4, 5, 6, 30, 40, 50]

Practice exercise#

Exercise 14

Add “numbers” to the list in the first practice exercise.

# Start your answers here

Some methods#

The following are methods that I find particularly useful when using lists:

append(elmnt): Appends an element to the end of the list. This append operation is performed in place.

print(numbers)

# Let's add the element 10 at the end of the list
numbers.append(10)

print(numbers)
[1, 2, 3, 4, 5, 6]
[1, 2, 3, 4, 5, 6, 10]

We could achieve the same result using list concatenation. However, concatenation creates a new list rather than modifying the original one, so we would need to reassign the result back to the original variable in order to update it.

# This was the original 'numbers' list
numbers = [1,2,3,4,5,6]

# This is using concatenation to add the value of 10
print(numbers + [10])

# However it did not modify the original list
print(numbers)

# We need to reassign the variable to the new list 
numbers = numbers + [10]
print(numbers)
[1, 2, 3, 4, 5, 6, 10]
[1, 2, 3, 4, 5, 6]
[1, 2, 3, 4, 5, 6, 10]

insert(pos, elmnt): Inserts the specified value at the specified position. This method takes two arguments. The first argument is the position in the list where you want to insert the new value, and the second argument is the value that you want to insert.

This operation is performed in place.

numbers = [1,2,3,4,5,6]
print(numbers)

numbers.insert(1, 10)

print(numbers)
[1, 2, 3, 4, 5, 6]
[1, 10, 2, 3, 4, 5, 6]

Notes#

Alert: We cannot multiply a list by another list. A list can only be multiplied by an integer.
# This is not allowed
myList * myList
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[23], line 2
      1 # This is not allowed
----> 2 myList * myList

TypeError: can't multiply sequence by non-int of type 'list'
Info: Lists can be nested.
names = ['Darrell', 'Clayton', ['Billie', 'Arthur'], 'Samantha']
print(names[2]) # returns a *list*
print(names[0]) # returns a *string*
['Billie', 'Arthur']
Darrell
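
To reach an element inside a nested list, the brackets can be chained. This is a minimal sketch using the same names list; first_inner is a name introduced only for this example.

```python
names = ['Darrell', 'Clayton', ['Billie', 'Arthur'], 'Samantha']

# The first bracket selects the nested list, the second selects inside it
first_inner = names[2][0]

print(first_inner)
print(len(names))  # the nested list counts as a single element
```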

Mutability#

Lists are mutable!

print(f"(Before) First name in names: {names[0]}")

names[0] = "Clint"

print(f"(After) First name in names: {names[0]}")
(Before) First name in names: Darrell
(After) First name in names: Clint
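
A practical consequence of mutability, sketched below with made-up variable names: assigning a list to a second variable does not copy it, so a change made through one name is visible through the other. The list method copy gives an independent (shallow) copy.

```python
original = ['Darrell', 'Clayton']
alias = original            # a second name for the SAME list
snapshot = original.copy()  # an independent (shallow) copy

original[0] = 'Clint'

print(alias)     # reflects the change
print(snapshot)  # still holds the old value
```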

Dictionaries#

Dictionaries are like hash tables, containing key-value pairs.

Elements are indexed using brackets [] (like lists).

However, dictionaries are constructed using braces {} or dict().

Key names must be unique. If you re-use a key, you overwrite its value.

Keys don’t have to be strings: they can be any immutable (hashable) value, such as numbers or tuples, or expressions that evaluate to one of these.

Constructing a dictionary#

dictionary1 = {
    'a': 1,
    'b': 2,
    'c': 3
}
dictionary2 = dict(x=55, y=29, z=99) # Note the absence of quotes around keys
dictionary2
{'x': 55, 'y': 29, 'z': 99}
dictionary3 = {'A': 'foo', 
               99: 'bar', # Note that the key is now a number
               (1,2): 'baz' # Note that the key is now a tuple (see below)
              }
dictionary3
{'A': 'foo', 99: 'bar', (1, 2): 'baz'}
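
The rules about keys stated at the start of this section can be checked with a short experiment. This is only an illustrative sketch; the dictionaries ages and coords are invented for the example.

```python
# Re-using a key overwrites the earlier value
ages = {'Tom': 30, 'Tom': 31}
print(ages)

# A tuple works as a key, but a list does not, because lists are mutable
coords = {(0, 0): 'origin'}
try:
    coords[[1, 1]] = 'not allowed'
except TypeError as err:
    message = str(err)
    print(message)
```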

Retrieve a value#

Just pass the key as the index in the brackets.

phonelist = {'Tom':123, 'Bob':456, 'Sam':897}
phonelist['Bob']
456

or use the method get:

phonelist.get('Bob')
456
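
The two approaches differ when the key is missing: brackets raise a KeyError, while get returns None, or a default value passed as its second argument. A minimal sketch, where 'Alice' is simply a name that is not in the dictionary:

```python
phonelist = {'Tom': 123, 'Bob': 456, 'Sam': 897}

# Brackets raise a KeyError for a missing key
try:
    phonelist['Alice']
except KeyError:
    print("no entry for Alice")

# get returns None (or a default you supply) instead of raising
missing = phonelist.get('Alice')
fallback = phonelist.get('Alice', 0)
print(missing)
print(fallback)
```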

Adding a new entry#

We can always add new entries to a dictionary by assigning a value to a new key.

# We create a new key-value mapping
phonelist["John"] = 332
phonelist
{'Tom': 123, 'Bob': 456, 'Sam': 897, 'John': 332}

Some methods#

keys: Provides the keys of a dictionary. Keys are not sorted; they appear in the order in which they were entered.

phonelist.keys() # Returns a view of the keys (not a list)
dict_keys(['Tom', 'Bob', 'Sam', 'John'])

values: Provides the values of a dictionary.

phonelist.values() # Returns a view of the values (not a list)
dict_values([123, 456, 897, 332])

items: Provides both the keys and values of a dictionary.

phonelist.items() # Returns a view of (key, value) tuples
dict_items([('Tom', 123), ('Bob', 456), ('Sam', 897), ('John', 332)])

These methods are handy when using For loops (next week).

for key in sorted(phonelist.keys()):
    print(key)
Bob
John
Sam
Tom

update: Adds the specified key-value pairs to the dictionary, in place. The pairs are usually given as another dictionary. If a key already exists, its value is overwritten.

phonelist.update({"Sarah": 223})
phonelist
{'Tom': 123, 'Bob': 456, 'Sam': 897, 'John': 332, 'Sarah': 223}
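
Because update works on keys, it overwrites the value of any key that already exists. A small sketch on a separate, made-up dictionary called contacts:

```python
contacts = {'Tom': 123, 'Bob': 456}

# update adds new keys and overwrites the values of existing ones, in place
contacts.update({'Bob': 999, 'Sarah': 223})

print(contacts)
```
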
Info: We can use len and in with Dictionaries.
len(phonelist)
5
# If we use the name of the dictionary with `in`, it checks the keys
"Bob" in phonelist
True
# Therefore, if we check for a particular value in this way, it will return False
123 in phonelist
False
# We have to specify that we want to check the values of the dictionary
123 in phonelist.values()
True

Tuples#

A tuple is like a list but with one big difference: a tuple is an immutable object! That is, you can’t change a tuple once it’s created.

A tuple can contain any number of elements of any datatype.

Tuples are accessed with brackets [] but constructed with or without parentheses ().

# We saw lists are mutable
numbers = [1, 2, 3, 4, 5, 6]
print(numbers)
numbers[3] = 30
print(numbers)
[1, 2, 3, 4, 5, 6]
[1, 2, 3, 30, 5, 6]
# But, tuples are immutable
numbers4 = (26, 27, 28)
numbers4[2] = 30
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[40], line 3
      1 # But, tuples are immutable
      2 numbers4 = (26, 27, 28)
----> 3 numbers4[2] = 30

TypeError: 'tuple' object does not support item assignment
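
Since a tuple cannot be changed in place, a "modified" version has to be built as a new tuple. One possible way, sketched here, is to combine a slice of the old tuple with the new value. The name numbers5 is introduced only for this example, and the trailing comma in (30,) is explained in the next section.

```python
numbers4 = (26, 27, 28)

# Build a new tuple from a slice of the old one plus the new value
numbers5 = numbers4[:2] + (30,)

print(numbers4)  # the original tuple is untouched
print(numbers5)
```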

Constructing#

Created with comma-separated values, with or without parentheses.

letters = 'a', 'b', 'c', 'd'
letters
('a', 'b', 'c', 'd')
type(letters)
tuple
numbers = (1,2,3,4) # numbers 1,2,3,4 stored in a tuple
numbers
(1, 2, 3, 4)
type(numbers)
tuple

A single-valued tuple must include a trailing comma (,), e.g.

# If we miss the comma with a single valued tuple, it is not a tuple
tuple0 = (29)
type(tuple0)
int
# comma included
tuple1 = (29,)

type(tuple1)
tuple
Info: Indexing and slicing work the same way as with lists and strings.
numbers[2]
3
numbers[:3]
(1, 2, 3)
Info: We can also use len and in with Tuples.
len(numbers)
4
10 in numbers
False

Sets#

A set is an unordered collection of unique objects.

They support mathematical set operations such as union, intersection, and difference.

peanuts = {'snoopy','snoopy','woodstock'}
# Sets only keep unique objects
peanuts
{'snoopy', 'woodstock'}
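
A common use of this property is removing duplicates from a list by passing it to set(). The sketch below uses a made-up list called visitors; sorted() is applied at the end only to print the result in a predictable order, since sets themselves are unordered.

```python
visitors = ['snoopy', 'woodstock', 'snoopy', 'lucy', 'woodstock']

unique_visitors = set(visitors)  # duplicates collapse to one entry each

print(len(visitors))
print(len(unique_visitors))
print(sorted(unique_visitors))   # sorted() returns a list
```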

Since sets are unordered, they don’t have an index. This will break:

peanuts[0]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[53], line 1
----> 1 peanuts[0]

TypeError: 'set' object is not subscriptable
# We can still loop over the elements of a set (in no guaranteed order)
for peanut in peanuts:
    print(peanut)
snoopy
woodstock
Info: We can use len and in with Sets.
len(peanuts)
2
'snoopy' in peanuts
True

Some useful methods#

  • union: To combine two sets!

set1 = {'python','R'}
set2 = {'R','SQL'}
set1.union(set2)
{'R', 'SQL', 'python'}
Alert: Sets cannot be combined with the `+` operator. Use the union method (or the `|` operator) instead.
set1 + set2
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[59], line 1
----> 1 set1 + set2

TypeError: unsupported operand type(s) for +: 'set' and 'set'
  • intersection: Get the set intersection:

set1.intersection(set2)
{'R'}

For this operation you can also use the symbol &

set1 & set2
{'R'}
  • difference: Returns the elements of the left-hand set that are not in the right-hand set:

set1.difference(set2)
{'python'}
set2.difference(set1)
{'SQL'}
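
Like intersection with &, the union and difference methods also have operator forms: | and -. The following sketch checks that each operator gives the same result as the corresponding method; combined and only_in_1 are names chosen for this example.

```python
set1 = {'python', 'R'}
set2 = {'R', 'SQL'}

combined = set1 | set2   # same result as set1.union(set2)
only_in_1 = set1 - set2  # same result as set1.difference(set2)

print(combined == set1.union(set2))
print(only_in_1 == set1.difference(set2))
```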

Ranges#

A range is a sequence of integers, from start to stop by step.

  • The start point is zero by default.

  • The stop point is NOT included. Like when slicing strings and lists, the last index of the sequence corresponds to stop-1.

  • The step is one by default.

Ranges can be assigned to a variable.

rng = range(5)
rng
range(0, 5)

More often, ranges are used in iterations (e.g. For loops), which we will cover later.

for rn in rng:
    print(rn)
0
1
2
3
4

Another range, from 1 up to (but not including) 11 in steps of 2:

rangy = range(1, 11, 2)
for rn in rangy:
    print(rn)
1
3
5
7
9
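
Ranges can also be inspected without a loop. As a brief sketch, the range from the previous example can be converted to a list to see all its values at once, and it works with len, in, indexing and slicing.

```python
rangy = range(1, 11, 2)

print(list(rangy))  # materialize the values as a list
print(len(rangy))
print(7 in rangy)
print(rangy[1])     # indexing returns an integer
print(rangy[:2])    # slicing returns another range
```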

Summary#

Throughout this lesson, we have mentioned the operations, functions and features that characterize each data structure. Here is a compact summary:

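| Data Structure | Supports Indexing? | Supports Slicing? | len() | in  | Concatenation (+) | Multiplication (*) | Mutable? |
|----------------|--------------------|-------------------|-------|-----|-------------------|--------------------|----------|
| Strings        | Yes                | Yes               | Yes   | Yes | Yes               | Yes                | No       |
| Lists          | Yes                | Yes               | Yes   | Yes | Yes               | Yes                | Yes      |
| Tuples         | Yes                | Yes               | Yes   | Yes | Yes               | Yes                | No       |
| Sets           | No                 | No                | Yes   | Yes | No                | No                 | Yes      |
| Dictionaries   | No                 | No                | Yes   | Yes | No                | No                 | Yes      |
| Ranges         | Yes                | Yes               | Yes   | Yes | No                | No                 | No       |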

Practice exercise#

Exercise 15

List one important attribute each of a list, a dictionary, and a tuple.

# Start your answers here