Data Structures

Module Objectives

By completing this module you will be able to:

  • Work with lists and tuples
  • Use sets for set theory operations
  • Create and manipulate dictionaries
  • Implement stacks and queues
  • Choose the right data structure for each problem

Why Do We Need Different Structures?

Imagine you’re organizing your kitchen. You wouldn’t store everything in the same drawer, would you? Utensils go in an organizer with divisions, spices on a shelf where you can see the labels, fresh food in the fridge… Each type of container is designed for a specific purpose.

In programming, it’s exactly the same. The data we handle has different characteristics and needs:

  • Sometimes you need to maintain a specific order (like the steps in a recipe).
  • Other times you want to find something quickly by its name (like in a phone book).
  • Sometimes you only care about whether something exists or not, without duplicates (like a guest list).

Python offers different “containers” for each situation. Choosing the right one will make your code more efficient and easier to understand. Let’s get to know them one by one.


Lists

Think of a shopping list or your favorite music playlist. Both have something in common: order matters and you can modify them (add songs, cross off products, reorder…).

In Python, a list works exactly like that: it’s an ordered collection where you can add, remove, or modify elements whenever you want.

1# Create lists
2numbers = [1, 2, 3, 4, 5]
3fruits = ["apple", "banana", "cherry"]
4mixed = [1, "two", 3.0, True, None]
5empty = []

Accessing Elements

One of the advantages of lists is that you can access any element by its position. It’s like saying “give me song number 3 from my playlist.”

 1fruits = ["apple", "banana", "cherry", "peach"]
 2
 3# Positive indices (from the start)
 4print(fruits[0])   # apple
 5print(fruits[1])   # banana
 6
 7# Negative indices (from the end)
 8print(fruits[-1])  # peach (last)
 9print(fruits[-2])  # cherry (second to last)
10
11# Slicing (sublists)
12print(fruits[1:3])    # ['banana', 'cherry']
13print(fruits[:2])     # ['apple', 'banana']
14print(fruits[2:])     # ['cherry', 'peach']
15print(fruits[::2])    # ['apple', 'cherry'] (skip by 2)

Useful List Methods

Lists come with a bunch of built-in tools. Think of them like the controls on your music player: you can add songs, remove them, search for a specific one, or reorder the list.

 1my_list = [1, 2, 3]
 2
 3# append: add to the end
 4my_list.append(4)
 5print(my_list)  # [1, 2, 3, 4]
 6
 7# insert: add at specific position
 8my_list.insert(0, 0)
 9print(my_list)  # [0, 1, 2, 3, 4]
10
11# extend: add multiple elements
12my_list.extend([5, 6])
13print(my_list)  # [0, 1, 2, 3, 4, 5, 6]
 1my_list = [1, 2, 3, 4, 5, 3]
 2
 3# remove: removes first occurrence of value
 4my_list.remove(3)
 5print(my_list)  # [1, 2, 4, 5, 3]
 6
 7# pop: removes and returns element by index
 8last = my_list.pop()      # no argument: last
 9print(last)               # 3
10print(my_list)            # [1, 2, 4, 5]
11
12first = my_list.pop(0)    # index 0: first
13print(first)              # 1
14
15# clear: removes all elements
16my_list.clear()
17print(my_list)  # []
 1my_list = [10, 20, 30, 40, 30]
 2
 3# index: finds the position of element
 4pos = my_list.index(30)
 5print(pos)  # 2 (first occurrence)
 6
 7# count: counts occurrences
 8times = my_list.count(30)
 9print(times)  # 2
10
11# in: checks if exists
12print(20 in my_list)   # True
13print(50 in my_list)   # False
 1numbers = [3, 1, 4, 1, 5, 9, 2, 6]
 2
 3# sort: sorts the list (modifies original)
 4numbers.sort()
 5print(numbers)  # [1, 1, 2, 3, 4, 5, 6, 9]
 6
 7# reverse: reverses order
 8numbers.reverse()
 9print(numbers)  # [9, 6, 5, 4, 3, 2, 1, 1]
10
11# sorted: returns a NEW sorted list
12original = [3, 1, 4, 1, 5]
13sorted_list = sorted(original)
14print(original)     # [3, 1, 4, 1, 5] (unchanged)
15print(sorted_list)  # [1, 1, 3, 4, 5]

List Comprehensions

Here comes one of Python’s superpowers. Imagine you want to create a list with the squares of numbers from 0 to 9. You could do it with a for loop, but Python offers you a much more elegant and concise way: list comprehensions.

It’s like telling Python: “Give me a list with X for each element in Y.” Let’s see the difference:

 1# Traditional way
 2squares = []
 3for x in range(10):
 4    squares.append(x ** 2)
 5
 6# List comprehension (equivalent)
 7squares = [x ** 2 for x in range(10)]
 8print(squares)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
 9
10# With condition
11evens = [x for x in range(20) if x % 2 == 0]
12print(evens)  # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
13
14# Transformation
15words = ["hello", "world", "python"]
16uppercase = [w.upper() for w in words]
17print(uppercase)  # ['HELLO', 'WORLD', 'PYTHON']

Tuples

Imagine you’re storing the GPS coordinates of your house: latitude 40.4168, longitude -3.7038. These numbers represent a fixed location. It wouldn’t make sense to “accidentally” change the latitude while your program runs, right?

That’s what tuples exist for: they’re like lists, but once created, they cannot be modified. This feature, far from being a limitation, is an advantage: it guarantees that certain data will remain intact throughout your program’s execution.

 1# Create tuples
 2coordinates = (10, 20)
 3colors = ("red", "green", "blue")
 4mixed = (1, "two", 3.0)
 5single_element = (42,)  # The comma is important
 6
 7# Tuples are immutable
 8# coordinates[0] = 5  # Error: TypeError
 9
10# Unpacking
11x, y = coordinates
12print(f"x={x}, y={y}")  # x=10, y=20
13
14# Useful for swapping values
15a, b = 1, 2
16a, b = b, a
17print(a, b)  # 2 1
When to use lists vs tuples?
  • Use lists when you need to modify elements
  • Use tuples for data that shouldn’t change (coordinates, RGB colors, etc.)
  • Tuples are slightly more efficient in memory and speed

Sets

Imagine you’re organizing a party and making a guest list. If someone tells you “invite Maria” twice, you won’t write her down twice, will you? You only need to know that Maria is invited, period.

Sets work exactly like that: they’re collections where there can’t be duplicates. Also, they don’t care about order (does it matter if Maria is first or last on the guest list?).

But what’s really powerful about sets are the operations between them. Imagine you have the list of students studying Python and the list of those studying Java. With sets, you can easily answer: Who studies both? Who studies only Python?

1# Create sets
2numbers = {1, 2, 3, 4, 5}
3letters = set("hello")  # {'h', 'e', 'l', 'o'} - no duplicates
4empty = set()  # {} would create an empty dictionary
5
6# Sets have no order and no duplicates
7numbers = {3, 1, 4, 1, 5, 9, 2, 6, 5}
8print(numbers)  # {1, 2, 3, 4, 5, 6, 9} - arbitrary order, no duplicates

Set Operations

This is where sets shine. Following the students example:

 1python = {"Ana", "Luis", "Maria", "Carlos"}
 2java = {"Luis", "Pedro", "Maria", "Juan"}
 3
 4# Union: elements in either set
 5print(python | java)
 6# {'Ana', 'Luis', 'Maria', 'Carlos', 'Pedro', 'Juan'}
 7
 8# Intersection: elements in both
 9print(python & java)
10# {'Luis', 'Maria'}
11
12# Difference: in python but not in java
13print(python - java)
14# {'Ana', 'Carlos'}
15
16# Symmetric difference: in one or the other, but not both
17print(python ^ java)
18# {'Ana', 'Carlos', 'Pedro', 'Juan'}

Set Methods

 1s = {1, 2, 3}
 2
 3# add: add one element
 4s.add(4)
 5print(s)  # {1, 2, 3, 4}
 6
 7# remove: remove (error if not exists)
 8s.remove(2)
 9print(s)  # {1, 3, 4}
10
11# discard: remove (no error if not exists)
12s.discard(10)  # Does nothing
13
14# Check membership (very fast)
15print(3 in s)  # True
Practical use: remove duplicates
1list_with_duplicates = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
2without_duplicates = list(set(list_with_duplicates))
3print(without_duplicates)  # [1, 2, 3, 4]

Dictionaries

Think of your phone’s contact list. When you want to call someone, you don’t scroll through all contacts one by one until you find them. You simply search by name and there’s the number! The name is the key and the phone number is the value.

Python dictionaries work exactly like that: they store key → value pairs. This makes them incredibly fast for finding information: whether you have 10 or 10 million contacts, finding one by name is practically instant.

 1# Create dictionaries
 2person = {
 3    "name": "Ana",
 4    "age": 25,
 5    "city": "Madrid"
 6}
 7
 8# Access values
 9print(person["name"])      # Ana
10print(person.get("age"))   # 25
11print(person.get("country", "Unknown"))  # Unknown (default value)
12
13# Modify values
14person["age"] = 26
15person["country"] = "Spain"  # Add new key
16
17# Delete
18del person["city"]
19value = person.pop("country")  # Remove and return

Iterating Over Dictionaries

Sometimes you need to go through all elements in a dictionary. Python gives you several ways to do it depending on what you need:

 1person = {"name": "Ana", "age": 25, "city": "Madrid"}
 2
 3# Keys only
 4for key in person:
 5    print(key)
 6
 7# Values only
 8for value in person.values():
 9    print(value)
10
11# Keys and values
12for key, value in person.items():
13    print(f"{key}: {value}")

Dictionary Comprehensions

1# Create dictionary of squares
2squares = {x: x**2 for x in range(6)}
3print(squares)  # {0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
4
5# Invert a dictionary
6original = {"a": 1, "b": 2, "c": 3}
7inverted = {v: k for k, v in original.items()}
8print(inverted)  # {1: 'a', 2: 'b', 3: 'c'}

Stacks and Queues

These two structures are fundamental in programming and we use them constantly without realizing it. The difference between them is simply the order in which elements come out.

Stacks - LIFO

Imagine a stack of freshly washed plates. When you put a clean plate, you place it on top of the stack. When you need one, you take the one on top (the last one you put). This is called LIFO: Last In, First Out.

graph LR
    subgraph Stack["Stack of elements"]
        direction TB
        A["third - top (last in)"]
        B["second"]
        C["first - bottom"]
    end
    E["append() adds to top"] --> A
    A --> D["pop() returns 'third'"]

Where are stacks used in real life? The “back” button in your browser is a stack: each page you visit is stacked, and when you press back, you pop the last one.

 1# Use a list as a stack
 2stack = []
 3
 4# push: add to top
 5stack.append("first")
 6stack.append("second")
 7stack.append("third")
 8print(stack)  # ['first', 'second', 'third']
 9
10# pop: remove from top
11element = stack.pop()
12print(element)  # third
13print(stack)    # ['first', 'second']
14
15# peek: view top without removing
16if stack:
17    top = stack[-1]
18    print(top)  # second

Queues - FIFO

Now imagine the checkout line at the supermarket. The first person to arrive is the first to pay and leave. This is FIFO: First In, First Out.

graph LR
    subgraph Queue["Queue of elements"]
        C["third"] --> B["second"] --> A["first"]
    end
    A --> D["exits first"]
    E["new entry"] --> C

Queues are used everywhere: the print queue (documents print in the order you sent them), requests to a web server, messages in a chat app…

 1from collections import deque
 2
 3# Create an efficient queue
 4queue = deque()
 5
 6# enqueue: add to back
 7queue.append("first")
 8queue.append("second")
 9queue.append("third")
10print(queue)  # deque(['first', 'second', 'third'])
11
12# dequeue: remove from front
13element = queue.popleft()
14print(element)  # first
15print(queue)    # deque(['second', 'third'])
Why use deque?

Although we could use list.pop(0) for queues, deque is much more efficient. pop(0) has O(n) complexity because it must move all elements, while popleft() is O(1).


Which Structure to Choose?

After seeing so many options, it’s normal to wonder: “When do I use each one?” Here’s a quick guide:

If you need… Use… Example
Ordered collection you can modify List Playlist, to-do list
Data that shouldn’t change Tuple Coordinates, dates, RGB colors
Unique elements, no duplicates Set Guest list, tags
Look up values by a key Dictionary Contacts, settings, cache
Last in, first out Stack (list) Browser history, undo
First in, first out Queue (deque) Print queue, messages
Practical Tip

When in doubt, start with a list. It’s the most versatile structure. Only switch to another when you have a specific reason: you need keys (dictionary), guarantee uniqueness (set), or immutability (tuple).


Practical Exercises

Exercise 1: Remove Duplicates Keeping Order

Given a list, remove duplicates but keep the original order of appearance.

1input_list = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
2# Expected output: [3, 1, 4, 5, 9, 2, 6]
 1def remove_duplicates(lst):
 2    seen = set()
 3    result = []
 4    for element in lst:
 5        if element not in seen:
 6            seen.add(element)
 7            result.append(element)
 8    return result
 9
10input_list = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
11print(remove_duplicates(input_list))  # [3, 1, 4, 5, 9, 2, 6]
Exercise 2: Word Frequency Counter

Given a text, count how many times each word appears.

1text = "the cat and the dog and the bird"
2# Expected output: {'the': 3, 'cat': 1, 'and': 2, 'dog': 1, 'bird': 1}
1def count_words(text):
2    words = text.lower().split()
3    frequency = {}
4    for word in words:
5        frequency[word] = frequency.get(word, 0) + 1
6    return frequency
7
8text = "the cat and the dog and the bird"
9print(count_words(text))
Exercise 3: Balanced Parentheses

Use a stack to verify if parentheses are balanced.

1"((()))"  # True
2"(()"     # False
3")("      # False
 1def balanced_parentheses(s):
 2    stack = []
 3    for char in s:
 4        if char == '(':
 5            stack.append(char)
 6        elif char == ')':
 7            if not stack:
 8                return False
 9            stack.pop()
10    return len(stack) == 0
11
12print(balanced_parentheses("((()))"))  # True
13print(balanced_parentheses("(()"))     # False
14print(balanced_parentheses(")("))      # False

Quiz

🎮 Quiz: Data Structures

0 / 0
Loading questions...

Previous: Virtual Environment Next: Strings and Dates