Optimization and Complexity
By completing this module you will be able to:
- Understand Big-O notation and algorithmic complexity
- Use profiling tools to identify bottlenecks
- Apply code optimization techniques
- Choose the most efficient data structures
Why Optimize?
Your code works. It does what it’s supposed to do. Great! But… is it fast? Does it use memory efficiently? Could it be better?
Think of a car. A compact car and a sports car both take you from point A to point B. Both “work”. But if you need to get there fast, or if gas is expensive, or if you have to make the trip a thousand times a day… suddenly performance matters.
The same thing happens in programming:
- A mobile app that takes 10 seconds to load loses users
- A server that processes 100 requests per second instead of 10,000 needs more machines (= more money)
- A script that takes 8 hours to process data wastes your time
“Premature optimization is the root of all evil” - Donald Knuth
Before optimizing, measure. Many programmers waste hours optimizing code that wasn’t the bottleneck. First make it work, then measure where the real problem is, and only then optimize that part.
Algorithmic Complexity
Imagine you have to look up “Garcia Lopez, Maria” in a phone book with a million names. How would you do it?
Option A - Linear search: Start from the first page and go through them one by one until you find her. In the worst case, you’d check all million names. If the book grows to 10 million, it would take 10 times longer.
Option B - Binary search: Open to the middle. “I’m at M… Garcia is before”. Open the first half in the middle. “I’m at F… Garcia is after”. And so on. With each step you eliminate half the options. For a million names, you need about 20 steps. For 10 million, only 24.
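The phone-book strategy of Option B can be sketched as a binary search over a sorted list. This is a minimal sketch (the function name, variable names, and example data are ours for illustration); in real code you would typically reach for Python's built-in bisect module instead:

```python
def binary_search(sorted_list, target):
    """Return the index of target in sorted_list, or -1 if absent. O(log n)."""
    low, high = 0, len(sorted_list) - 1
    while low <= high:
        mid = (low + high) // 2        # look at the middle element
        if sorted_list[mid] == target:
            return mid
        elif sorted_list[mid] < target:
            low = mid + 1              # target is in the upper half
        else:
            high = mid - 1             # target is in the lower half
    return -1

names = ["Alvarez", "Diaz", "Garcia", "Lopez", "Martinez", "Perez"]
print(binary_search(names, "Garcia"))  # → 2
```

Each pass through the loop halves the search range, which is exactly why a million names take about 20 steps.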
Algorithmic complexity measures exactly this: how time (or memory) grows as the problem size grows. And the difference can be brutal.
Big-O Notation
Big-O notation is the standard way to express complexity. It ignores constants and focuses on how the algorithm scales:
| Notation | Name | In plain English | Example |
|---|---|---|---|
| O(1) | Constant | “Always equally fast” | List access by index |
| O(log n) | Logarithmic | “Grows very slowly” | Binary search |
| O(n) | Linear | “Grows proportionally” | Traverse a list |
| O(n log n) | Linearithmic | “A bit worse than linear” | Sorting with mergesort |
| O(n²) | Quadratic | “Shoots up fast” | Nested loops |
| O(2ⁿ) | Exponential | “Explodes” | Recursive Fibonacci |
```mermaid
graph LR
    subgraph Growth["How time grows with n elements"]
        A["O(1): ━━━"]
        B["O(log n): ━━━━"]
        C["O(n): ━━━━━━━━"]
        D["O(n²): ━━━━━━━━━━━━━━━━━━━━"]
    end
```

```python
# O(1) - Constant
def direct_access(lst, index):
    return lst[index]

# O(n) - Linear
def linear_search(lst, element):
    for item in lst:
        if item == element:
            return True
    return False

# O(n²) - Quadratic
def has_duplicates(lst):
    for i in range(len(lst)):
        for j in range(i + 1, len(lst)):
            if lst[i] == lst[j]:
                return True
    return False
```

Complexity Comparison
```python
import time

def measure(func, *args):
    start = time.perf_counter()
    result = func(*args)
    end = time.perf_counter()
    return end - start, result

# Demo: linear search vs set
numbers = list(range(1000000))
number_set = set(numbers)
search_for = 999999

# O(n) - List
list_time, _ = measure(lambda: search_for in numbers)
print(f"List: {list_time:.6f}s")

# O(1) on average - Set
set_time, _ = measure(lambda: search_for in number_set)
print(f"Set: {set_time:.6f}s")
```

Profiling: Diagnosis Before the Prescription
Imagine you go to the doctor because you feel tired. A bad doctor would prescribe vitamins without further ado. A good doctor would first do blood tests, measure your blood pressure, ask about your sleep… Diagnose before prescribing.
With code it’s the same. Before optimizing, you need to know where the problem is. 90% of execution time is usually concentrated in 10% of the code. If you optimize the wrong part, you waste time.
Python gives you several “diagnostic” tools:
timeit: The Precision Stopwatch
For comparing alternatives and measuring times accurately:
```python
import timeit

# Measure expression time
time_taken = timeit.timeit('"-".join(str(n) for n in range(100))', number=10000)
print(f"Time: {time_taken:.4f}s")

# Compare alternatives
setup = "text = 'hello world' * 1000"

time1 = timeit.timeit("len(text)", setup=setup, number=100000)
time2 = timeit.timeit("text.__len__()", setup=setup, number=100000)

print(f"len(): {time1:.4f}s")
print(f"__len__(): {time2:.4f}s")
```

cProfile: The X-Ray of Your Code
When you need to see which functions consume the most time:
```python
import cProfile
import pstats

def slow_function():
    total = 0
    for i in range(10000):
        for j in range(100):
            total += i * j
    return total

# Run with profiling
profiler = cProfile.Profile()
profiler.enable()
result = slow_function()
profiler.disable()

# View statistics
stats = pstats.Stats(profiler)
stats.sort_stats('cumulative')
stats.print_stats(10)  # Top 10 functions
```

memory_profiler: The Consumption Meter
Sometimes the problem isn’t time, but memory. Your program can be fast but consume gigabytes of RAM. To detect this:
memory_profiler is a third-party package, so install it first with pip install memory-profiler:

```python
from memory_profiler import profile

@profile
def memory_function():
    large_list = [i ** 2 for i in range(100000)]
    return sum(large_list)

result = memory_function()
```

Optimization Techniques
Once you’ve identified the bottleneck, here are the most effective techniques to fix it.
1. Choose the Right Tool for Each Job
You wouldn’t use a screwdriver as a hammer, would you? Well in programming, using the wrong data structure is exactly that.
```python
# Bad: search in list O(n)
users = ["ana", "luis", "maria", "pedro"]
if "maria" in users:  # Slow for large lists
    pass

# Better: use set, O(1) on average
users_set = {"ana", "luis", "maria", "pedro"}
if "maria" in users_set:  # Very fast
    pass

# Bad: concatenate strings in loop
result = ""
for i in range(1000):
    result += str(i)  # Creates a new string each time

# Better: use join
result = "".join(str(i) for i in range(1000))
```

2. Don't Do the Same Work Twice
If you already counted how many guests are at the party, why count them again every time someone asks?
```python
lst = list(range(10000))

# Bad: len() is re-evaluated on every iteration
i = 0
while i < len(lst):
    i += 1

# Better: calculate the length once
length = len(lst)
i = 0
while i < length:
    i += 1

# Or even better: iterate directly
for element in lst:
    pass
```

Note that for i in range(len(lst)) calls len() only once, so the repeated-work trap appears in patterns like the while loop above, where the condition is re-evaluated each pass.

3. Let the Experts Do the Heavy Lifting
Python’s built-in functions (sum, max, min, sorted, etc.) are implemented in C and optimized by experts. Using them is like hiring a professional instead of doing it yourself by hand.
```python
import time

numbers = list(range(1000000))

# Bad: hand-written implementation
start = time.perf_counter()
maximum = numbers[0]
for n in numbers:
    if n > maximum:
        maximum = n
print(f"Manual: {time.perf_counter() - start:.4f}s")

# Better: built-in function (implemented in C)
start = time.perf_counter()
maximum = max(numbers)
print(f"max(): {time.perf_counter() - start:.4f}s")
```

4. Comprehensions: The Pythonic Way
List comprehensions are not only more readable, they’re also faster because Python optimizes them internally.
```python
import timeit

# Traditional loop
def with_loop():
    result = []
    for i in range(1000):
        result.append(i ** 2)
    return result

# List comprehension
def with_comprehension():
    return [i ** 2 for i in range(1000)]

print(f"Loop: {timeit.timeit(with_loop, number=10000):.4f}s")
print(f"Comprehension: {timeit.timeit(with_comprehension, number=10000):.4f}s")
```

5. Remember What You Already Calculated (Memoization)
Remember when in school you wrote down the answers to exercises you’d already solved? That’s exactly what memoization is: saving results to avoid recalculating them.
```python
import time
from functools import lru_cache

# Without cache: O(2^n) - very slow
def fib_slow(n):
    if n < 2:
        return n
    return fib_slow(n-1) + fib_slow(n-2)

# With cache: O(n) - very fast
@lru_cache(maxsize=None)
def fib_fast(n):
    if n < 2:
        return n
    return fib_fast(n-1) + fib_fast(n-2)

start = time.perf_counter()
print(fib_fast(35))
print(f"With cache: {time.perf_counter() - start:.4f}s")
```

6. Generators: Produce on Demand
As we saw in the functions chapter, generators are perfect when you work with large data. Instead of creating a giant list in memory, they produce values one by one.
```python
import sys

# List: holds every value in memory at once
lst = [i ** 2 for i in range(1000000)]
print(f"List: {sys.getsizeof(lst) / 1024 / 1024:.2f} MB")

# Generator: minimal memory, produces values on demand
gen = (i ** 2 for i in range(1000000))
print(f"Generator: {sys.getsizeof(gen)} bytes")
```

Common Mistakes When Optimizing
- Optimizing before measuring: “I think this function is slow” isn’t enough. Use profiling to confirm where the problem really is.
- Optimizing what doesn’t matter: If a function takes 0.001 seconds and is called once, optimizing it doesn’t make sense even if you could make it 10 times faster.
- Sacrificing readability for micro-optimizations: Code that’s 5% faster but impossible to maintain is a bad deal. Clarity almost always wins.
- Ignoring algorithmic complexity: No micro-optimization will save an O(n²) algorithm when n grows. First fix the algorithm, then optimize the details.
Quick Guide: What to Optimize?
| If your problem is… | The typical solution is… |
|---|---|
| Slow searches in large lists | Use set or dict instead of list |
| Concatenating strings in a loop | Use "".join() |
| Expensive function called many times | @lru_cache or manual memoization |
| Processing huge files | Generators and reading in chunks |
| Slow loops with transformations | List comprehensions |
| Calculating the same thing repeatedly | Save the result in a variable |
| Pure Python code too slow | Consider NumPy, Cython or PyPy |
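For the “processing huge files” row, the generator-plus-chunks idea can be sketched like this (the function name and default chunk size are our own illustrative choices):

```python
def read_in_chunks(path, chunk_size=64 * 1024):
    """Yield a file's contents chunk by chunk instead of loading it all at once."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break  # end of file reached
            yield chunk

# Usage: process a multi-gigabyte file with only 64 KB in memory at a time
# total_bytes = sum(len(chunk) for chunk in read_in_chunks("huge_file.bin"))
```

Because each chunk is discarded after processing, memory usage stays constant no matter how large the file grows.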
80% of performance improvements come from 20% of the code. Find that 20% with profiling before touching anything.
Practical Exercises
Optimize this function that finds duplicates in a list:
```python
def find_duplicates(lst):
    duplicates = []
    for i in range(len(lst)):
        for j in range(i + 1, len(lst)):
            if lst[i] == lst[j] and lst[i] not in duplicates:
                duplicates.append(lst[i])
    return duplicates
```

Measure and compare the performance of these two ways to create a dictionary:
- Using a for loop
- Using dictionary comprehension
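As a starting point for the second exercise, here is one way to set up the comparison with timeit (function names and sizes are our own; the exact timings will vary by machine):

```python
import timeit

# Building a dictionary with a traditional for loop
def dict_with_loop():
    d = {}
    for i in range(1000):
        d[i] = i ** 2
    return d

# Building the same dictionary with a comprehension
def dict_with_comprehension():
    return {i: i ** 2 for i in range(1000)}

print(f"Loop: {timeit.timeit(dict_with_loop, number=1000):.4f}s")
print(f"Comprehension: {timeit.timeit(dict_with_comprehension, number=1000):.4f}s")
```

Both functions produce identical dictionaries, so any difference you measure comes purely from how Python executes each construct.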