
Scenario 1: Python to C Optimization

Learning Objectives

  • Use time to measure execution time
  • Profile Python code with cProfile
  • Generate flamegraphs with py-spy
  • Use ctypes to call C code from Python
  • Understand ctypes call overhead and when to move loops to C

Files

  • prime_slow.py - Pure Python implementation (slow)
  • prime.c - C implementation of the hot function
  • prime_fastish.py - Python loop calling C is_prime via ctypes
  • prime_fast.py - Entire counting loop in C via ctypes
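
The exercise files themselves aren't reproduced here. A minimal sketch of what prime_slow.py might contain, with names assumed from the file list above:

```python
# prime_slow.py -- pure-Python baseline (hypothetical reconstruction)

def is_prime(n):
    """Trial division: try every candidate divisor up to sqrt(n)."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def count_primes(limit):
    """Count primes below limit with a plain Python loop."""
    return sum(1 for n in range(2, limit) if is_prime(n))

if __name__ == "__main__":
    print(count_primes(100_000))
```

Every iteration of both loops goes through the interpreter, which is what the later steps optimize away.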

Exercises

Step 1: Measure the baseline

time python3 prime_slow.py

Note the real time (wall clock) and user time (CPU time in user space).
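
To time a specific region from inside Python rather than the whole process, time.perf_counter provides a high-resolution wall clock; a small sketch with a stand-in workload:

```python
import time

start = time.perf_counter()
total = sum(n * n for n in range(1_000_000))  # stand-in workload
elapsed = time.perf_counter() - start
print(f"workload took {elapsed:.3f}s")
```

Unlike the shell's time, this excludes interpreter startup, which matters when the measured work is short.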

Step 2: Profile with cProfile

python3 -m cProfile -s cumtime prime_slow.py

Look for:

  • Which function has the highest cumtime (cumulative time)?
  • How many times (ncalls) is is_prime called?
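
cProfile can also be driven from Python, which is handy for profiling one function in isolation. A self-contained sketch (with a toy is_prime inlined) that produces the same cumulative-time report as -s cumtime:

```python
import cProfile
import io
import pstats

def is_prime(n):
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

profiler = cProfile.Profile()
profiler.enable()
count = sum(1 for n in range(2, 10_000) if is_prime(n))
profiler.disable()

# Sort by cumulative time, print the top entries.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

In the report, the ncalls column for is_prime should equal the number of candidates tested, and its cumtime should dominate.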

Step 3: Generate a flamegraph

py-spy record -o prime_slow.svg -- python3 prime_slow.py

Open prime_slow.svg in a browser. Bar width is proportional to time sampled in that function and its callees, so the widest frames show where the program spends its time.

Step 4: Compile and run the partially optimized version

# Compile the C library
gcc -O2 -fPIC -shared -o libprime.so prime.c

# Run the "fastish" version (C is_prime, Python loop)
time python3 prime_fastish.py

This version calls the C is_prime function, but the counting loop is still in Python, so it pays the ctypes call overhead once per candidate number.

Step 5: Run the fully optimized version

time python3 prime_fast.py

This version runs the entire counting loop inside a single C function, count_primes - Python makes exactly one ctypes call and pays no per-iteration overhead.

Step 6: Compare

  • How much faster is prime_fastish.py vs prime_slow.py?
  • How much faster is prime_fast.py vs prime_fastish.py?
  • Generate flamegraphs for both - what's different?

Discussion Questions

  1. Why is the C version faster? (Hint: interpreter overhead, type checking)
  2. Why is prime_fast.py faster than prime_fastish.py? (Hint: ctypes call overhead)
  3. When is it worth rewriting in C vs. finding a library?
  4. What's the trade-off of using ctypes vs native Python?
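
To see the per-call overhead from question 2 directly, compare a trivial ctypes call against the equivalent Python builtin. This sketch loads libc's labs through the process's own symbols (CDLL(None), which works on Linux and macOS); the exact ratio is machine-dependent:

```python
import ctypes
import timeit

# CDLL(None) exposes symbols already linked into the process,
# including libc's labs (long absolute value).
libc = ctypes.CDLL(None)
libc.labs.argtypes = [ctypes.c_long]
libc.labs.restype = ctypes.c_long

n = 200_000
t_ctypes = timeit.timeit(lambda: libc.labs(-42), number=n)
t_python = timeit.timeit(lambda: abs(-42), number=n)
print(f"ctypes labs: {t_ctypes:.4f}s  builtin abs: {t_python:.4f}s")
# For tiny functions, crossing the Python/C boundary (argument
# conversion, FFI dispatch) typically costs more than the work itself.
```

This is exactly why prime_fast.py wins: it amortizes one boundary crossing over the whole loop instead of paying it per call.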