Optimization and Complexity

Module Objectives

By completing this module you will be able to:

  • Understand Big-O notation and algorithmic complexity
  • Use profiling tools to identify bottlenecks
  • Apply code optimization techniques
  • Choose the most efficient data structures

Why Optimize?

Your code works. It does what it’s supposed to do. Great! But… is it fast? Does it use memory efficiently? Could it be better?

Think of a car. A compact car and a sports car both take you from point A to point B. Both “work”. But if you need to get there fast, or if gas is expensive, or if you have to make the trip a thousand times a day… suddenly performance matters.

In programming the same thing happens:

  • A mobile app that takes 10 seconds to load loses users
  • A server that processes 100 requests per second instead of 10,000 needs more machines (= more money)
  • A script that takes 8 hours to process data wastes your time

The Golden Rule

“Premature optimization is the root of all evil” - Donald Knuth

Before optimizing, measure. Many programmers waste hours optimizing code that wasn’t the bottleneck. First make it work, then measure where the real problem is, and only then optimize that part.


Algorithmic Complexity

Imagine you have to look up “Garcia Lopez, Maria” in a phone book with a million names. How would you do it?

Option A - Linear search: Start from the first page and go through them one by one until you find her. In the worst case, you’d check all million names. If the book grows to 10 million, it would take 10 times longer.

Option B - Binary search: Open to the middle. “I’m at M… Garcia is before”. Open the first half in the middle. “I’m at F… Garcia is after”. And so on. With each step you eliminate half the options. For a million names, you need about 20 steps. For 10 million, only 24.

Algorithmic complexity measures exactly this: how time (or memory) grows as the problem size grows. And the difference can be brutal.
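
To make option B concrete, here is a minimal sketch of binary search over a sorted list (the function name and the tiny example list are just illustrative):

def binary_search(sorted_names, target):
    # Each step discards half of the remaining names,
    # so the number of steps grows like log n instead of n
    low, high = 0, len(sorted_names) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_names[mid] == target:
            return True
        elif sorted_names[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return False

# Works only because the "phone book" is sorted
print(binary_search(["Diaz", "Garcia", "Lopez", "Perez"], "Garcia"))  # True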

Big-O Notation

Big-O notation is the standard way to express complexity. It ignores constants and focuses on how the algorithm scales:

Notation   | Name         | In plain English           | Example
O(1)       | Constant     | “Always equally fast”      | List access by index
O(log n)   | Logarithmic  | “Grows very slowly”        | Binary search
O(n)       | Linear       | “Grows proportionally”     | Traverse a list
O(n log n) | Linearithmic | “A bit worse than linear”  | Sorting with mergesort
O(n²)      | Quadratic    | “Shoots up fast”           | Nested loops
O(2ⁿ)      | Exponential  | “Explodes”                 | Recursive Fibonacci
How time grows with n elements:

  O(1):     ━━━
  O(log n): ━━━━
  O(n):     ━━━━━━━━
  O(n²):    ━━━━━━━━━━━━━━━━━━━━

# O(1) - Constant
def direct_access(lst, index):
    return lst[index]

# O(n) - Linear
def linear_search(lst, element):
    for item in lst:
        if item == element:
            return True
    return False

# O(n²) - Quadratic
def has_duplicates(lst):
    for i in range(len(lst)):
        for j in range(i + 1, len(lst)):
            if lst[i] == lst[j]:
                return True
    return False

Complexity Comparison

import time

def measure(func, *args):
    start = time.perf_counter()
    result = func(*args)
    end = time.perf_counter()
    return end - start, result

# Demo: linear search vs set
numbers = list(range(1000000))
number_set = set(numbers)
search_for = 999999

# O(n) - List
list_time, _ = measure(lambda: search_for in numbers)
print(f"List: {list_time:.6f}s")

# O(1) - Set
set_time, _ = measure(lambda: search_for in number_set)
print(f"Set: {set_time:.6f}s")

Profiling: Diagnosis Before the Prescription

Imagine you go to the doctor because you feel tired. A bad doctor would prescribe vitamins without further ado. A good doctor would first do blood tests, measure your blood pressure, ask about your sleep… Diagnose before prescribing.

With code it’s the same. Before optimizing, you need to know where the problem is. 90% of execution time is usually concentrated in 10% of the code. If you optimize the wrong part, you waste time.

Python gives you several “diagnostic” tools:

timeit: The Precision Stopwatch

For comparing alternatives and measuring times accurately:

import timeit

# Measure expression time
time_taken = timeit.timeit('"-".join(str(n) for n in range(100))', number=10000)
print(f"Time: {time_taken:.4f}s")

# Compare alternatives
setup = "text = 'hello world' * 1000"

time1 = timeit.timeit("len(text)", setup=setup, number=100000)
time2 = timeit.timeit("text.__len__()", setup=setup, number=100000)

print(f"len(): {time1:.4f}s")
print(f"__len__(): {time2:.4f}s")

cProfile: The X-Ray of Your Code

When you need to see which functions consume the most time:

import cProfile
import pstats

def slow_function():
    total = 0
    for i in range(10000):
        for j in range(100):
            total += i * j
    return total

# Run with profiling
profiler = cProfile.Profile()
profiler.enable()
result = slow_function()
profiler.disable()

# View statistics
stats = pstats.Stats(profiler)
stats.sort_stats('cumulative')
stats.print_stats(10)  # Top 10 functions
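
If you prefer not to modify the code, cProfile can also profile an entire script from the command line (my_script.py is a placeholder for your own file):

python -m cProfile -s cumulative my_script.py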

memory_profiler: The Consumption Meter

Sometimes the problem isn’t time, but memory. Your program can be fast but consume gigabytes of RAM. To detect this:

from memory_profiler import profile

@profile
def memory_function():
    large_list = [i ** 2 for i in range(100000)]
    return sum(large_list)

result = memory_function()
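
Note that memory_profiler is a third-party package (pip install memory_profiler). When you run the script, it prints a line-by-line report showing how much memory each line of the decorated function adds; you can also launch it as python -m memory_profiler your_script.py, where your_script.py is a placeholder for your own file.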

Optimization Techniques

Once you’ve identified the bottleneck, here are the most effective techniques to fix it.

1. Choose the Right Tool for Each Job

You wouldn’t use a screwdriver as a hammer, would you? In programming, using the wrong data structure is exactly that.

# Bad: search in list O(n)
users = ["ana", "luis", "maria", "pedro"]
if "maria" in users:  # Slow for large lists
    pass

# Better: use set O(1)
users_set = {"ana", "luis", "maria", "pedro"}
if "maria" in users_set:  # Very fast
    pass

# Bad: concatenate strings in loop
result = ""
for i in range(1000):
    result += str(i)  # Creates new string each time

# Better: use join
result = "".join(str(i) for i in range(1000))

2. Don’t Do the Same Work Twice

If you already counted how many guests are at the party, why count them again every time someone asks?

lst = list(range(10000))

# Bad: len() is re-evaluated on every pass through the loop
i = 0
while i < len(lst):
    i += 1

# Better: calculate it once
length = len(lst)
i = 0
while i < length:
    i += 1

# Or even better: iterate directly
for element in lst:
    pass

3. Let the Experts Do the Heavy Lifting

Python’s built-in functions (sum, max, min, sorted, etc.) are implemented in C and optimized by experts. Using them is like hiring a professional instead of doing it yourself by hand.

import time

numbers = list(range(1000000))

# Bad: own implementation
start = time.perf_counter()
maximum = numbers[0]
for n in numbers:
    if n > maximum:
        maximum = n
print(f"Manual: {time.perf_counter() - start:.4f}s")

# Better: built-in function (implemented in C)
start = time.perf_counter()
maximum = max(numbers)
print(f"max(): {time.perf_counter() - start:.4f}s")

4. Comprehensions: The Pythonic Way

List comprehensions are not only more readable, they’re also faster because Python optimizes them internally.

import timeit

# Traditional loop
def with_loop():
    result = []
    for i in range(1000):
        result.append(i ** 2)
    return result

# List comprehension
def with_comprehension():
    return [i ** 2 for i in range(1000)]

print(f"Loop: {timeit.timeit(with_loop, number=10000):.4f}s")
print(f"Comprehension: {timeit.timeit(with_comprehension, number=10000):.4f}s")

5. Remember What You Already Calculated (Memoization)

Remember when in school you wrote down the answers to exercises you’d already solved? That’s exactly what memoization is: saving results to avoid recalculating them.

from functools import lru_cache

# Without cache: O(2^n) - very slow
def fib_slow(n):
    if n < 2:
        return n
    return fib_slow(n-1) + fib_slow(n-2)

# With cache: O(n) - very fast
@lru_cache(maxsize=None)
def fib_fast(n):
    if n < 2:
        return n
    return fib_fast(n-1) + fib_fast(n-2)

import time

start = time.perf_counter()
print(fib_fast(35))
print(f"With cache: {time.perf_counter() - start:.4f}s")

6. Generators: Produce on Demand

As we saw in the functions chapter, generators are perfect when you work with large data. Instead of creating a giant list in memory, they produce values one by one.

import sys

# List: uses memory
lst = [i ** 2 for i in range(1000000)]
print(f"List: {sys.getsizeof(lst) / 1024 / 1024:.2f} MB")

# Generator: minimal memory
gen = (i ** 2 for i in range(1000000))
print(f"Generator: {sys.getsizeof(gen)} bytes")

Common Mistakes When Optimizing

Traps to Avoid
  1. Optimizing before measuring: “I think this function is slow” isn’t enough. Use profiling to confirm where the problem really is.

  2. Optimizing what doesn’t matter: If a function takes 0.001 seconds and is called once, optimizing it doesn’t make sense even if you could make it 10 times faster.

  3. Sacrificing readability for micro-optimizations: Code that’s 5% faster but impossible to maintain is a bad deal. Clarity almost always wins.

  4. Ignoring algorithmic complexity: No micro-optimization will save an O(n²) algorithm when n grows. First fix the algorithm, then optimize the details.


Quick Guide: What to Optimize?

If your problem is…                   | The typical solution is…
Slow searches in large lists          | Use a set or dict instead of a list
Concatenating strings in a loop       | Use "".join()
Expensive function called many times  | @lru_cache or manual memoization
Processing huge files                 | Generators and reading in chunks
Slow loops with transformations       | List comprehensions
Calculating the same thing repeatedly | Save the result in a variable
Pure Python code too slow             | Consider NumPy, Cython or PyPy

The 80/20 Rule

80% of performance improvements come from 20% of the code. Find that 20% with profiling before touching anything.
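
As an illustration of the last row of the quick guide: if NumPy is installed (pip install numpy), moving numeric work into vectorized operations usually gives a large speedup. A rough, illustrative comparison:

import time
import numpy as np

n = 1_000_000

start = time.perf_counter()
total_py = sum(i * i for i in range(n))  # pure Python loop
print(f"Pure Python: {time.perf_counter() - start:.4f}s")

start = time.perf_counter()
arr = np.arange(n, dtype=np.int64)
total_np = int((arr * arr).sum())  # vectorized in C
print(f"NumPy: {time.perf_counter() - start:.4f}s")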


Practical Exercises

Exercise 1: Optimize Duplicate Finding

Optimize this function that finds duplicates in a list:

def find_duplicates(lst):
    duplicates = []
    for i in range(len(lst)):
        for j in range(i + 1, len(lst)):
            if lst[i] == lst[j] and lst[i] not in duplicates:
                duplicates.append(lst[i])
    return duplicates

Solution:

def find_duplicates_optimized(lst):
    seen = set()
    duplicates = set()
    for element in lst:
        if element in seen:
            duplicates.add(element)
        else:
            seen.add(element)
    return list(duplicates)

# Original version is O(n²) due to the nested loops; the optimized version is O(n)

Exercise 2: Compare Performance

Measure and compare the performance of these two ways to create a dictionary:

  1. Using a for loop
  2. Using dictionary comprehension

import timeit

def with_loop():
    result = {}
    for i in range(1000):
        result[i] = i ** 2
    return result

def with_comprehension():
    return {i: i ** 2 for i in range(1000)}

loop_time = timeit.timeit(with_loop, number=10000)
comp_time = timeit.timeit(with_comprehension, number=10000)

print(f"Loop: {loop_time:.4f}s")
print(f"Comprehension: {comp_time:.4f}s")
print(f"Difference: {abs(loop_time - comp_time) / loop_time * 100:.1f}%")
