Data Structures
By completing this module you will be able to:
- Work with lists and tuples
- Use sets for set theory operations
- Create and manipulate dictionaries
- Implement stacks and queues
- Choose the right data structure for each problem
Why Do We Need Different Structures?
Imagine you’re organizing your kitchen. You wouldn’t store everything in the same drawer, would you? Utensils go in an organizer with divisions, spices on a shelf where you can see the labels, fresh food in the fridge… Each type of container is designed for a specific purpose.
In programming, it’s exactly the same. The data we handle has different characteristics and needs:
- Sometimes you need to maintain a specific order (like the steps in a recipe).
- Other times you want to find something quickly by its name (like in a phone book).
- Sometimes you only care about whether something exists or not, without duplicates (like a guest list).
Python offers different “containers” for each situation. Choosing the right one will make your code more efficient and easier to understand. Let’s get to know them one by one.
Lists
Think of a shopping list or your favorite music playlist. Both have something in common: order matters and you can modify them (add songs, cross off products, reorder…).
In Python, a list works exactly like that: it’s an ordered collection where you can add, remove, or modify elements whenever you want.
1# Create lists
2numbers = [1, 2, 3, 4, 5]
3fruits = ["apple", "banana", "cherry"]
4mixed = [1, "two", 3.0, True, None]
5empty = []Accessing Elements
One of the advantages of lists is that you can access any element by its position. It’s like saying “give me song number 3 from my playlist.”
1fruits = ["apple", "banana", "cherry", "peach"]
2
3# Positive indices (from the start)
4print(fruits[0]) # apple
5print(fruits[1]) # banana
6
7# Negative indices (from the end)
8print(fruits[-1]) # peach (last)
9print(fruits[-2]) # cherry (second to last)
10
11# Slicing (sublists)
12print(fruits[1:3]) # ['banana', 'cherry']
13print(fruits[:2]) # ['apple', 'banana']
14print(fruits[2:]) # ['cherry', 'peach']
15print(fruits[::2]) # ['apple', 'cherry'] (skip by 2)Useful List Methods
Lists come with a bunch of built-in tools. Think of them like the controls on your music player: you can add songs, remove them, search for a specific one, or reorder the list.
1my_list = [1, 2, 3]
2
3# append: add to the end
4my_list.append(4)
5print(my_list) # [1, 2, 3, 4]
6
7# insert: add at specific position
8my_list.insert(0, 0)
9print(my_list) # [0, 1, 2, 3, 4]
10
11# extend: add multiple elements
12my_list.extend([5, 6])
13print(my_list) # [0, 1, 2, 3, 4, 5, 6] 1my_list = [1, 2, 3, 4, 5, 3]
2
3# remove: removes first occurrence of value
4my_list.remove(3)
5print(my_list) # [1, 2, 4, 5, 3]
6
7# pop: removes and returns element by index
8last = my_list.pop() # no argument: last
9print(last) # 3
10print(my_list) # [1, 2, 4, 5]
11
12first = my_list.pop(0) # index 0: first
13print(first) # 1
14
15# clear: removes all elements
16my_list.clear()
17print(my_list) # [] 1my_list = [10, 20, 30, 40, 30]
2
3# index: finds the position of element
4pos = my_list.index(30)
5print(pos) # 2 (first occurrence)
6
7# count: counts occurrences
8times = my_list.count(30)
9print(times) # 2
10
11# in: checks if exists
12print(20 in my_list) # True
13print(50 in my_list) # False 1numbers = [3, 1, 4, 1, 5, 9, 2, 6]
2
3# sort: sorts the list (modifies original)
4numbers.sort()
5print(numbers) # [1, 1, 2, 3, 4, 5, 6, 9]
6
7# reverse: reverses order
8numbers.reverse()
9print(numbers) # [9, 6, 5, 4, 3, 2, 1, 1]
10
11# sorted: returns a NEW sorted list
12original = [3, 1, 4, 1, 5]
13sorted_list = sorted(original)
14print(original) # [3, 1, 4, 1, 5] (unchanged)
15print(sorted_list) # [1, 1, 3, 4, 5]List Comprehensions
Here comes one of Python’s superpowers. Imagine you want to create a list with the squares of numbers from 0 to 9. You could do it with a for loop, but Python offers you a much more elegant and concise way: list comprehensions.
It’s like telling Python: “Give me a list with X for each element in Y.” Let’s see the difference:
1# Traditional way
2squares = []
3for x in range(10):
4 squares.append(x ** 2)
5
6# List comprehension (equivalent)
7squares = [x ** 2 for x in range(10)]
8print(squares) # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
9
10# With condition
11evens = [x for x in range(20) if x % 2 == 0]
12print(evens) # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
13
14# Transformation
15words = ["hello", "world", "python"]
16uppercase = [w.upper() for w in words]
17print(uppercase) # ['HELLO', 'WORLD', 'PYTHON']Tuples
Imagine you’re storing the GPS coordinates of your house: latitude 40.4168, longitude -3.7038. These numbers represent a fixed location. It wouldn’t make sense to “accidentally” change the latitude while your program runs, right?
That’s what tuples exist for: they’re like lists, but once created, they cannot be modified. This feature, far from being a limitation, is an advantage: it guarantees that certain data will remain intact throughout your program’s execution.
1# Create tuples
2coordinates = (10, 20)
3colors = ("red", "green", "blue")
4mixed = (1, "two", 3.0)
5single_element = (42,) # The comma is important
6
7# Tuples are immutable
8# coordinates[0] = 5 # Error: TypeError
9
10# Unpacking
11x, y = coordinates
12print(f"x={x}, y={y}") # x=10, y=20
13
14# Useful for swapping values
15a, b = 1, 2
16a, b = b, a
17print(a, b) # 2 1- Use lists when you need to modify elements
- Use tuples for data that shouldn’t change (coordinates, RGB colors, etc.)
- Tuples are slightly more efficient in memory and speed
Sets
Imagine you’re organizing a party and making a guest list. If someone tells you “invite Maria” twice, you won’t write her down twice, will you? You only need to know that Maria is invited, period.
Sets work exactly like that: they’re collections where there can’t be duplicates. Also, they don’t care about order (does it matter if Maria is first or last on the guest list?).
But what’s really powerful about sets are the operations between them. Imagine you have the list of students studying Python and the list of those studying Java. With sets, you can easily answer: Who studies both? Who studies only Python?
1# Create sets
2numbers = {1, 2, 3, 4, 5}
3letters = set("hello") # {'h', 'e', 'l', 'o'} - no duplicates
4empty = set() # {} would create an empty dictionary
5
6# Sets have no order and no duplicates
7numbers = {3, 1, 4, 1, 5, 9, 2, 6, 5}
8print(numbers) # {1, 2, 3, 4, 5, 6, 9} - arbitrary order, no duplicatesSet Operations
This is where sets shine. Following the students example:
1python = {"Ana", "Luis", "Maria", "Carlos"}
2java = {"Luis", "Pedro", "Maria", "Juan"}
3
4# Union: elements in either set
5print(python | java)
6# {'Ana', 'Luis', 'Maria', 'Carlos', 'Pedro', 'Juan'}
7
8# Intersection: elements in both
9print(python & java)
10# {'Luis', 'Maria'}
11
12# Difference: in python but not in java
13print(python - java)
14# {'Ana', 'Carlos'}
15
16# Symmetric difference: in one or the other, but not both
17print(python ^ java)
18# {'Ana', 'Carlos', 'Pedro', 'Juan'}Set Methods
1s = {1, 2, 3}
2
3# add: add one element
4s.add(4)
5print(s) # {1, 2, 3, 4}
6
7# remove: remove (error if not exists)
8s.remove(2)
9print(s) # {1, 3, 4}
10
11# discard: remove (no error if not exists)
12s.discard(10) # Does nothing
13
14# Check membership (very fast)
15print(3 in s) # True1list_with_duplicates = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
2without_duplicates = list(set(list_with_duplicates))
3print(without_duplicates) # [1, 2, 3, 4]Dictionaries
Think of your phone’s contact list. When you want to call someone, you don’t scroll through all contacts one by one until you find them. You simply search by name and there’s the number! The name is the key and the phone number is the value.
Python dictionaries work exactly like that: they store key → value pairs. This makes them incredibly fast for finding information: whether you have 10 or 10 million contacts, finding one by name is practically instant.
1# Create dictionaries
2person = {
3 "name": "Ana",
4 "age": 25,
5 "city": "Madrid"
6}
7
8# Access values
9print(person["name"]) # Ana
10print(person.get("age")) # 25
11print(person.get("country", "Unknown")) # Unknown (default value)
12
13# Modify values
14person["age"] = 26
15person["country"] = "Spain" # Add new key
16
17# Delete
18del person["city"]
19value = person.pop("country") # Remove and returnIterating Over Dictionaries
Sometimes you need to go through all elements in a dictionary. Python gives you several ways to do it depending on what you need:
1person = {"name": "Ana", "age": 25, "city": "Madrid"}
2
3# Keys only
4for key in person:
5 print(key)
6
7# Values only
8for value in person.values():
9 print(value)
10
11# Keys and values
12for key, value in person.items():
13 print(f"{key}: {value}")Dictionary Comprehensions
1# Create dictionary of squares
2squares = {x: x**2 for x in range(6)}
3print(squares) # {0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
4
5# Invert a dictionary
6original = {"a": 1, "b": 2, "c": 3}
7inverted = {v: k for k, v in original.items()}
8print(inverted) # {1: 'a', 2: 'b', 3: 'c'}Stacks and Queues
These two structures are fundamental in programming and we use them constantly without realizing it. The difference between them is simply the order in which elements come out.
Stacks - LIFO
Imagine a stack of freshly washed plates. When you put a clean plate, you place it on top of the stack. When you need one, you take the one on top (the last one you put). This is called LIFO: Last In, First Out.
graph LR
subgraph Stack["Stack of elements"]
direction TB
A["third - top (last in)"]
B["second"]
C["first - bottom"]
end
E["append() adds to top"] --> A
A --> D["pop() returns 'third'"]Where are stacks used in real life? The “back” button in your browser is a stack: each page you visit is stacked, and when you press back, you pop the last one.
1# Use a list as a stack
2stack = []
3
4# push: add to top
5stack.append("first")
6stack.append("second")
7stack.append("third")
8print(stack) # ['first', 'second', 'third']
9
10# pop: remove from top
11element = stack.pop()
12print(element) # third
13print(stack) # ['first', 'second']
14
15# peek: view top without removing
16if stack:
17 top = stack[-1]
18 print(top) # secondQueues - FIFO
Now imagine the checkout line at the supermarket. The first person to arrive is the first to pay and leave. This is FIFO: First In, First Out.
graph LR
subgraph Queue["Queue of elements"]
C["third"] --> B["second"] --> A["first"]
end
A --> D["exits first"]
E["new entry"] --> CQueues are used everywhere: the print queue (documents print in the order you sent them), requests to a web server, messages in a chat app…
1from collections import deque
2
3# Create an efficient queue
4queue = deque()
5
6# enqueue: add to back
7queue.append("first")
8queue.append("second")
9queue.append("third")
10print(queue) # deque(['first', 'second', 'third'])
11
12# dequeue: remove from front
13element = queue.popleft()
14print(element) # first
15print(queue) # deque(['second', 'third'])Although we could use list.pop(0) for queues, deque is much more efficient. pop(0) has O(n) complexity because it must move all elements, while popleft() is O(1).
Which Structure to Choose?
After seeing so many options, it’s normal to wonder: “When do I use each one?” Here’s a quick guide:
| If you need… | Use… | Example |
|---|---|---|
| Ordered collection you can modify | List | Playlist, to-do list |
| Data that shouldn’t change | Tuple | Coordinates, dates, RGB colors |
| Unique elements, no duplicates | Set | Guest list, tags |
| Look up values by a key | Dictionary | Contacts, settings, cache |
| Last in, first out | Stack (list) | Browser history, undo |
| First in, first out | Queue (deque) | Print queue, messages |
When in doubt, start with a list. It’s the most versatile structure. Only switch to another when you have a specific reason: you need keys (dictionary), guarantee uniqueness (set), or immutability (tuple).
Practical Exercises
Given a list, remove duplicates but keep the original order of appearance.
1input_list = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
2# Expected output: [3, 1, 4, 5, 9, 2, 6]Given a text, count how many times each word appears.
1text = "the cat and the dog and the bird"
2# Expected output: {'the': 3, 'cat': 1, 'and': 2, 'dog': 1, 'bird': 1}Use a stack to verify if parentheses are balanced.
1"((()))" # True
2"(()" # False
3")(" # False