Scenario 1: Python to C Optimization
Learning Objectives
- Use `time` to measure execution time
- Profile Python code with `cProfile`
- Generate flamegraphs with `py-spy`
- Use `ctypes` to call C code from Python
- Understand ctypes call overhead and when to move loops to C
Files
- `prime_slow.py` - Pure Python implementation (slow)
- `prime.c` - C implementation of the hot function
- `prime_fastish.py` - Python loop calling C `is_prime` via ctypes
- `prime_fast.py` - Entire counting loop in C via ctypes
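The actual file contents aren't reproduced here, but a minimal sketch of what `prime_slow.py` plausibly contains (hypothetical, not the real file) helps set expectations for the profiling steps below: a trial-division `is_prime` plus a pure-Python counting loop.

```python
# Hypothetical sketch of prime_slow.py -- trial-division primality test
# plus a counting loop, all in pure Python. The real file may differ.

def is_prime(n):
    """Return True if n is prime, using trial division up to sqrt(n)."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def count_primes(limit):
    """Count primes below limit -- the hot loop the profiler will highlight."""
    return sum(1 for n in range(limit) if is_prime(n))

if __name__ == "__main__":
    print(count_primes(10_000))
```

Every `is_prime` call here goes through the interpreter's bytecode loop, which is what the C versions later eliminate.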
Exercises
Step 1: Measure the baseline
```
time python3 prime_slow.py
```
Note the real time (wall clock) and user time (CPU time in user space).
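The same wall-clock vs. CPU-time distinction can be observed from inside Python, which is handy when `time` isn't available. A small sketch using the standard library:

```python
import time

def busy_work():
    # CPU-bound work: wall time and CPU time should be close.
    return sum(i * i for i in range(200_000))

wall_start = time.perf_counter()   # wall clock, analogous to `real`
cpu_start = time.process_time()    # CPU time, roughly user + sys
busy_work()
wall = time.perf_counter() - wall_start
cpu = time.process_time() - cpu_start
print(f"wall: {wall:.4f}s  cpu: {cpu:.4f}s")
```

If wall time greatly exceeds CPU time, the program is waiting (I/O, sleeping, contention) rather than computing.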
Step 2: Profile with cProfile
```
python3 -m cProfile -s cumtime prime_slow.py
```
Look for:
- Which function has the highest `cumtime` (cumulative time)?
- How many times (`ncalls`) is `is_prime` called?
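cProfile can also be driven from code rather than the command line, which is useful for profiling just one section. A sketch (the `is_prime` here is a stand-in, not the scenario's actual file) that sorts by cumulative time, mirroring `-s cumtime`:

```python
import cProfile
import io
import pstats

def is_prime(n):
    # Stand-in for the is_prime in prime_slow.py (hypothetical here).
    if n < 2:
        return False
    return all(n % i for i in range(2, int(n ** 0.5) + 1))

profiler = cProfile.Profile()
profiler.enable()
count = sum(1 for n in range(10_000) if is_prime(n))
profiler.disable()

# Print the top entries sorted by cumulative time, like `-s cumtime`.
out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumtime").print_stats(5)
report = out.getvalue()
print(report)
```

The `ncalls` column in the report shows `is_prime` being called once per candidate, which is the clue that the loop itself is the hot path.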
Step 3: Generate a flamegraph
```
py-spy record -o prime_slow.svg -- python3 prime_slow.py
```
Open `prime_slow.svg` in a browser. The widest bars show where the most time is spent.
Step 4: Compile and run the partially optimized version
```
# Compile the C library
gcc -O2 -fPIC -shared -o libprime.so prime.c

# Run the "fastish" version (C is_prime, Python loop)
time python3 prime_fastish.py
```
This version calls the C `is_prime` function, but the counting loop is still in Python, so every candidate number crosses the Python/C boundary.
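The essential ctypes pattern is: load the shared library, declare argument and return types, then call. Since `libprime.so` may not be built in every environment, this sketch demonstrates the pattern with the C math library instead; `prime_fastish.py` presumably does the same with `ctypes.CDLL("./libprime.so")` and `c_int`.

```python
import ctypes
import ctypes.util

# Demonstrating the ctypes pattern with libm (present on any Unix-like system);
# prime_fastish.py presumably applies it to ctypes.CDLL("./libprime.so").
libm = ctypes.CDLL(ctypes.util.find_library("m"))

# Always declare argument and return types: ctypes defaults to int,
# which silently corrupts doubles and other non-int values.
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double

print(libm.sqrt(2.0))
```

Forgetting `restype` here would make `libm.sqrt(2.0)` return a garbage integer rather than a float, a classic ctypes pitfall.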
Step 5: Run the fully optimized version
```
time python3 prime_fast.py
```
This version calls `count_primes` entirely in C - no Python loop.
Step 6: Compare
- How much faster is `prime_fastish.py` vs `prime_slow.py`?
- How much faster is `prime_fast.py` vs `prime_fastish.py`?
- Generate flamegraphs for both - what's different?
Discussion Questions
- Why is the C version faster? (Hint: interpreter overhead, type checking)
- Why is `prime_fast.py` faster than `prime_fastish.py`? (Hint: ctypes call overhead)
- When is it worth rewriting in C vs. finding a library?
- What's the trade-off of using `ctypes` vs. native Python?
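To make the call-overhead question concrete, time a trivial function called through ctypes against a builtin. This sketch uses libm's `fabs` as the trivial C function (the scenario's `libprime.so` would show the same effect); exact numbers are machine-dependent.

```python
import ctypes
import ctypes.util
import time

libm = ctypes.CDLL(ctypes.util.find_library("m"))
libm.fabs.argtypes = [ctypes.c_double]
libm.fabs.restype = ctypes.c_double

N = 200_000

start = time.perf_counter()
for _ in range(N):
    libm.fabs(-1.0)          # every call crosses the ctypes boundary
ctypes_time = time.perf_counter() - start

start = time.perf_counter()
for _ in range(N):
    abs(-1.0)                # builtin: no foreign-function machinery
builtin_time = time.perf_counter() - start

print(f"ctypes fabs: {ctypes_time:.3f}s  builtin abs: {builtin_time:.3f}s")
```

The per-call gap is the reason `prime_fast.py`, which crosses the boundary once, beats `prime_fastish.py`, which crosses it once per candidate number.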