# Scenario 1: Python to C Optimization

## Learning Objectives

- Use `time` to measure execution time
- Profile Python code with `cProfile`
- Generate flamegraphs with `py-spy`
- Use `ctypes` to call C code from Python
- Understand ctypes call overhead and when to move loops to C

## Files

- `prime_slow.py` - Pure Python implementation (slow)
- `prime.c` - C implementation of the hot function
- `prime_fastish.py` - Python loop calling C `is_prime` via ctypes
- `prime_fast.py` - Entire counting loop in C via ctypes

## Exercises

### Step 1: Measure the baseline

```bash
time python3 prime_slow.py
```

Note the `real` time (wall clock) and `user` time (CPU time in user space).

### Step 2: Profile with cProfile

```bash
python3 -m cProfile -s cumtime prime_slow.py
```

Look for:

- Which function has the highest `cumtime` (cumulative time)?
- How many times (`ncalls`) is `is_prime` called?

### Step 3: Generate a flamegraph

```bash
py-spy record -o prime_slow.svg -- python3 prime_slow.py
```

Open `prime_slow.svg` in a browser. The widest bar at the top shows where time is spent.

### Step 4: Compile and run the partially optimized version

```bash
# Compile the C library
gcc -O2 -fPIC -shared -o libprime.so prime.c

# Run the "fastish" version (C is_prime, Python loop)
time python3 prime_fastish.py
```

This version calls the C `is_prime` function, but the counting loop is still in Python.

### Step 5: Run the fully optimized version

```bash
time python3 prime_fast.py
```

This version calls `count_primes` entirely in C - no Python loop.

### Step 6: Compare

- How much faster is `prime_fastish.py` than `prime_slow.py`?
- How much faster is `prime_fast.py` than `prime_fastish.py`?
- Generate flamegraphs for both - what's different?

## Discussion Questions

1. Why is the C version faster? (Hint: interpreter overhead, type checking)
2. Why is `prime_fast.py` faster than `prime_fastish.py`? (Hint: ctypes call overhead)
3. When is it worth rewriting in C vs. finding a library?
4. What's the trade-off of using `ctypes` vs native Python?
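## Appendix: Reference Sketches

The scenario's source files are not reproduced in this README. For orientation, the hot path in `prime_slow.py` presumably looks something like the sketch below - trial division in `is_prime`, called once per candidate from a Python counting loop. The function names follow the text above, but the limits and exact structure are assumptions; the real file may differ.

```python
# Hypothetical sketch of the pure-Python baseline (prime_slow.py).
# The real file in this scenario may use different limits or structure.

def is_prime(n: int) -> bool:
    """Trial division up to sqrt(n) - the function cProfile should
    flag with the highest cumtime and a large ncalls count."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def count_primes(limit: int) -> int:
    """Pure-Python counting loop: one is_prime call per candidate,
    each paying full interpreter dispatch and dynamic-typing cost."""
    return sum(1 for n in range(2, limit) if is_prime(n))

if __name__ == "__main__":
    print(count_primes(100))  # 25 primes below 100
```

Profiling this under `cProfile -s cumtime` should show `is_prime` dominating, with `ncalls` roughly equal to the range being scanned.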
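A plausible shape for `prime.c` is sketched below. The exported names `is_prime` and `count_primes` match the ones the Python scripts are described as calling; everything else (types, bounds) is an assumption, and the real file may differ.

```c
/* Hypothetical sketch of prime.c; the real file in this scenario
 * may differ. Build as in Step 4:
 *   gcc -O2 -fPIC -shared -o libprime.so prime.c */

/* Trial division: the same algorithm as the Python version, minus
 * interpreter dispatch and dynamic typing on every operation. */
int is_prime(int n) {
    if (n < 2) return 0;
    for (int i = 2; (long long)i * i <= n; i++) {
        if (n % i == 0) return 0;
    }
    return 1;
}

/* The whole counting loop lives in C, so prime_fast.py pays one
 * ctypes call in total instead of one per candidate - this is the
 * difference question 2 is pointing at. */
int count_primes(int limit) {
    int count = 0;
    for (int n = 2; n < limit; n++) {
        count += is_prime(n);
    }
    return count;
}
```

The `(long long)` cast in the loop condition guards `i * i` against `int` overflow for large candidates; it changes nothing for small inputs.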
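To feel the per-call overhead behind discussion question 2 without building `libprime.so`, you can time a C function reached through `ctypes` against a Python builtin. This sketch uses libc's `abs` via `ctypes.CDLL(None)`, which loads the current process's symbols on POSIX systems (it will not work this way on Windows); exact timings vary by machine, but the point is that each foreign call carries fixed marshalling cost, which is why `prime_fastish.py` (one ctypes call per candidate) loses to `prime_fast.py` (one ctypes call total).

```python
# Rough demonstration of per-call ctypes overhead using libc's abs(),
# so it runs without compiling anything. POSIX-only (CDLL(None)).
import ctypes
import timeit

libc = ctypes.CDLL(None)            # symbols of the running process, incl. libc
libc.abs.argtypes = [ctypes.c_int]  # declare types so ctypes stops guessing
libc.abs.restype = ctypes.c_int

assert libc.abs(-5) == 5

py_time = timeit.timeit("abs(-5)", number=1_000_000)
c_time = timeit.timeit("libc.abs(-5)", globals={"libc": libc}, number=1_000_000)
print(f"builtin abs:     {py_time:.3f}s per 1M calls")
print(f"ctypes libc abs: {c_time:.3f}s per 1M calls")
```

On a typical machine the ctypes path is several times slower per call than the builtin, even though the C `abs` itself is trivially cheap: the argument conversion and foreign-function dispatch dominate. Moving the loop across the boundary, as `prime_fast.py` does, amortizes that cost to a single call.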