# Scenario 3: The Syscall Storm

## Learning Objectives

- Understand the cost of syscalls (user-space to kernel-space transitions)
- Use `strace -c` to count and profile syscalls
- Learn why buffering matters for I/O performance
- Understand `time` output: `real`, `user`, `sys`

## Files

- `read_slow.c` - Reads the file byte-by-byte with raw `read()` syscalls
- `read_fast.c` - Reads the file in 64KB chunks
- `read_stdio.c` - Uses stdio's `fgetc()` (internally buffered)
- `read_python.py` - Python equivalents

(Minimal sketches of the three C readers appear in the appendix at the end of this document.)

## Setup

```bash
# Compile all C programs and create test file
make all
```

## Exercise 1: Observe the Problem

### Step 1: Time the slow version

```bash
time ./read_slow testfile
```

Notice that `sys` time dominates - the CPU spends most of its time in the kernel servicing syscalls!

### Step 2: Count the syscalls

```bash
strace -c ./read_slow testfile
```

Look at:

- `calls` column for `read`: should be ~1,000,000 (one per byte!)
- `% time` column: most of the time is spent in `read`

### Step 3: Compare with the fast version

```bash
time ./read_fast testfile
strace -c ./read_fast testfile
```

The `read` call count drops from ~1,000,000 to ~16 (1MB / 64KB).

## Exercise 2: Understanding the Time Output

```bash
time ./read_slow testfile
```

Output explanation:

- `real` - Wall-clock time (what you'd measure with a stopwatch)
- `user` - CPU time in user space (your code running)
- `sys` - CPU time in kernel space (syscalls, I/O)

For `read_slow`, `sys` >> `user` because most of the time is spent in the kernel. For `read_fast`, `user` > `sys` because more of the time goes to processing data in user space.

## Exercise 3: strace Deep Dive

### See individual syscalls (first 50)

```bash
strace -T ./read_slow testfile 2>&1 | head -50
```

The `-T` flag shows the time spent in each syscall.

### Filter to just read() calls

```bash
strace -e read -c ./read_slow testfile
```

## Exercise 4: stdio Comparison

```bash
time ./read_stdio testfile
strace -c ./read_stdio testfile
```

Questions:

- How many `read` syscalls does stdio make?
- Why is it still slower than `read_fast`?
- Hint: check the default stdio buffer size (usually 4KB or 8KB)

## Exercise 5: Python I/O

```bash
# Create a smaller file for the unbuffered test
make smallfile

# Run the Python comparison
python3 read_python.py smallfile

# Profile with strace
strace -c python3 read_python.py smallfile
```

## Key Insights

### Why are syscalls expensive?

1. **Context switch**: the CPU saves user state and loads kernel state
2. **Security checks**: the kernel validates permissions
3. **Memory barriers**: caches may need flushing
4. **Scheduling**: the kernel may switch to another process

### The buffering solution

Instead of:

```
read(1 byte) → kernel → read(1 byte) → kernel → ... (a million times)
```

we do:

```
read(64KB) → kernel → process 64KB in user space → read(64KB) → ...
```

### Rule of thumb

- Syscall overhead: ~100-1000 nanoseconds
- Reading 64KB: ~10-100 microseconds (from cache/RAM)
- Break-even: the buffer should be at least a few KB

## perf stat Comparison

```bash
perf stat ./read_slow testfile
perf stat ./read_fast testfile
```

Look at:

- `context-switches`
- `cpu-migrations`
- `instructions per cycle`

## Further Exploration

1. What happens with `read(fd, buf, 4096)` vs `read(fd, buf, 65536)`?
2. How does `mmap()` compare? (Memory-mapped I/O)
3. What about the `O_DIRECT` flag? (Bypass the page cache)
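## Appendix: Reference Sketches

The actual sources ship with the scenario; the sketches below are only minimal illustrations of the patterns being compared, written under the assumption that each program reads a file named on the command line and does trivial per-byte work. Details (error handling, output format) will differ from the real files.

First, a byte-by-byte reader in the style of `read_slow.c`: every byte costs a full `read()` syscall, so a 1MB file means roughly a million user-to-kernel transitions.

```c
/* Sketch of a byte-by-byte reader (read_slow.c style) - hypothetical, for illustration. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    unsigned char byte;
    long total = 0;
    ssize_t n;

    /* Each iteration crosses into the kernel and back for a single byte. */
    while ((n = read(fd, &byte, 1)) > 0)
        total += byte;   /* trivial "work" so the loop isn't optimized away */

    if (n < 0)
        perror("read");

    printf("checksum: %ld\n", total);
    close(fd);
    return 0;
}
```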
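Next, a chunked reader in the style of `read_fast.c`: one `read()` brings in 64KB, and the per-byte loop then runs entirely in user space, which is why the syscall count collapses from ~1,000,000 to ~16 for a 1MB file. Again, this is a sketch, not the shipped source.

```c
/* Sketch of a chunked reader (read_fast.c style) - hypothetical, for illustration. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define CHUNK (64 * 1024)   /* 64KB buffer, matching the scenario description */

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    unsigned char *buf = malloc(CHUNK);
    if (!buf) {
        perror("malloc");
        close(fd);
        return 1;
    }

    long total = 0;
    ssize_t n;

    /* One kernel transition per 64KB; the inner loop stays in user space. */
    while ((n = read(fd, buf, CHUNK)) > 0)
        for (ssize_t i = 0; i < n; i++)
            total += buf[i];

    if (n < 0)
        perror("read");

    printf("checksum: %ld\n", total);
    free(buf);
    close(fd);
    return 0;
}
```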
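Finally, a stdio version in the style of `read_stdio.c`: `fgetc()` looks byte-by-byte, but stdio maintains its own user-space buffer (commonly 4KB or 8KB) and issues one `read()` per refill. That is why Exercise 4 shows far fewer syscalls than `read_slow` but typically more than `read_fast`, plus per-character function-call overhead.

```c
/* Sketch of a stdio reader (read_stdio.c style) - hypothetical, for illustration. */
#include <stdio.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    FILE *fp = fopen(argv[1], "rb");
    if (!fp) {
        perror("fopen");
        return 1;
    }

    long total = 0;
    int c;

    /* Most fgetc() calls are served from stdio's buffer; read() happens only on refill. */
    while ((c = fgetc(fp)) != EOF)
        total += c;

    printf("checksum: %ld\n", total);
    fclose(fp);
    return 0;
}
```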