# Scenario 2: Memoization and Precomputation

## Learning Objectives

- Read cProfile output to identify redundant function calls
- Use `@functools.lru_cache` for automatic memoization
- Recognize when precomputation beats memoization
- Understand space-time trade-offs

## Files

- `fib_slow.py` - Naive recursive Fibonacci (exponential time)
- `fib_cached.py` - Memoized Fibonacci (linear time)
- `config_validator.py` - Comparison of naive, memoized, and precomputed approaches

## Exercise 1: Fibonacci

### Step 1: Experience the slowness

```bash
time python3 fib_slow.py 35
```

This should take several seconds. Don't try n=50!

### Step 2: Profile to understand why

```bash
python3 -m cProfile -s ncalls fib_slow.py 35 2>&1 | head -20
```

Key insight: look at `ncalls` for the `fib` function. For fib(35), it is called millions of times because the same values are recomputed repeatedly. The call tree looks like:

```
fib(5)
├── fib(4)
│   ├── fib(3)
│   │   ├── fib(2)
│   │   └── fib(1)
│   └── fib(2)
└── fib(3)   <-- Same as above! Redundant!
    ├── fib(2)
    └── fib(1)
```

### Step 3: Apply memoization

```bash
time python3 fib_cached.py 35
```

Now try a much larger value:

```bash
time python3 fib_cached.py 100
```

### Step 4: Verify the improvement

```bash
python3 -m cProfile -s ncalls fib_cached.py 35 2>&1 | head -20
```

The `ncalls` for `fib` should now be O(n) instead of O(2^n).

## Exercise 2: Config Validator

This example shows when precomputation is better than memoization.

### Run all three strategies

```bash
python3 config_validator.py 5000
```

### Profile to understand the differences

```bash
python3 -m cProfile -s cumtime config_validator.py 5000
```

### Discussion Questions

1. Why is precomputation faster than memoization here?
   - Hint: How many unique inputs are there?
   - Hint: What's the overhead of an `lru_cache` lookup vs a plain dict lookup?
2. When would memoization be better than precomputation?
   - Hint: What if there were 10,000 rules and 10,000 event types?
   - Hint: What if we didn't know the inputs ahead of time?
3. What's the memory trade-off?

## Key Takeaways

| Approach | When to Use |
|----------|-------------|
| No caching | Function is cheap OR called once per input |
| Memoization | Unknown/large input space, function is expensive |
| Precomputation | Known/small input space, amortize cost over many lookups |

## Further Reading

- `functools.lru_cache` documentation
- `functools.cache` (Python 3.9+) - unbounded cache, simpler API
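The two techniques from the exercises can be sketched side by side. This is a minimal illustration, not the contents of `fib_cached.py` or `config_validator.py`: the rule names and `expensive_check` helper below are hypothetical stand-ins for whatever validation work those files actually do.

```python
from functools import lru_cache

# --- Memoization: cache results as inputs arrive (Exercise 1) ---

@lru_cache(maxsize=None)
def fib(n: int) -> int:
    """Each fib(k) is computed once; repeat calls are O(1) cache hits."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(35))           # 9227465, near-instant instead of seconds
print(fib.cache_info())  # misses == 36 (fib(0)..fib(35)); the rest are hits

# --- Precomputation: build the whole table up front (Exercise 2) ---

def expensive_check(rule: str) -> bool:
    # Hypothetical stand-in for real validation work.
    return rule.isidentifier()

KNOWN_RULES = ["max_retries", "timeout_s", "log_level"]   # hypothetical inputs
VALIDATION_TABLE = {r: expensive_check(r) for r in KNOWN_RULES}  # cost paid once

print(VALIDATION_TABLE["timeout_s"])  # True -- a plain dict lookup per query
```

Note the trade-off the discussion questions point at: `lru_cache` pays a small wrapper and hashing cost on every call but needs no knowledge of future inputs, while the precomputed dict is cheaper per lookup but only works when the input space is known and small enough to enumerate.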