Linux Performance Tools Cheatsheet
Quick Reference Card for Workshop
time - Basic Timing
time ./program # Wall clock, user, sys time
/usr/bin/time -v ./program # Verbose stats (GNU time; the shell builtin `time` has no -v)
Output explained:
- real: Wall clock time (what a stopwatch would show)
- user: CPU time in user space (your code)
- sys: CPU time in kernel space (syscalls)
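One way to read the three numbers together: if user + sys is far below real, the program spent most of its time waiting (I/O, sleep, locks); if the sum exceeds real, work ran on more than one core. A minimal sketch of that arithmetic, where cpu_util is a hypothetical helper name and the arguments are seconds copied by hand from the time output:

```shell
# cpu_util REAL USER SYS
# Prints (user + sys) / real as a percentage.
# Hypothetical helper; all three arguments are seconds from `time`.
cpu_util() {
    LC_ALL=C awk -v r="$1" -v u="$2" -v s="$3" 'BEGIN {
        if (r <= 0) { print "n/a"; exit }
        printf "%.0f%%\n", (u + s) * 100 / r
    }'
}

# Example: real 2.0s, user 0.4s, sys 0.1s
# cpu_util 2.0 0.4 0.1
```

A low percentage suggests looking at waiting (strace, I/O) before CPU profiling.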
perf stat - Hardware Counters
# Basic stats
perf stat ./program
# Specific events
perf stat -e cycles,instructions,cache-misses ./program
# Repeat for statistical accuracy
perf stat -r 5 ./program
# Common events
perf stat -e cycles,instructions,cache-references,cache-misses,branches,branch-misses ./program
Key metrics:
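- IPC (Instructions Per Cycle): Higher is generally better; above 1 is a reasonable rule of thumb, and values well below 1 often indicate stalls (for example, waiting on memory)
- Cache miss ratio (cache-misses / cache-references): Lower is better
- Branch miss ratio (branch-misses / branches): Lower is better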
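These metrics are plain divisions of the raw counts. perf stat usually prints them itself when the relevant events are counted together, so the sketch below is only meant to make the arithmetic explicit. ipc and miss_pct are hypothetical helper names; the arguments are counts copied from perf stat output.

```shell
# ipc INSTRUCTIONS CYCLES
# Prints instructions per cycle with two decimals.
ipc() {
    LC_ALL=C awk -v i="$1" -v c="$2" 'BEGIN {
        if (c > 0) printf "%.2f\n", i / c; else print "n/a"
    }'
}

# miss_pct MISSES REFERENCES
# Prints misses as a percentage of references (works for cache or branch counts).
miss_pct() {
    LC_ALL=C awk -v m="$1" -v r="$2" 'BEGIN {
        if (r > 0) printf "%.1f%%\n", m * 100 / r; else print "n/a"
    }'
}

# Example:
# ipc 3000000 2000000
# miss_pct 50 1000
```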
perf record/report - CPU Sampling
# Record samples
perf record ./program
perf record -g ./program # With call graphs
perf record -F 99 ./program # Custom frequency (99 Hz)
# Analyze
perf report # Interactive TUI
perf report --stdio # Text output
perf report -n --stdio # With sample counts
TUI navigation:
- Arrow keys: Navigate
- Enter: Zoom into function
- a: Annotate (show source/assembly)
- q: Quit/back
perf annotate - Source-Level View
# After perf record
perf annotate function_name # Show hot lines
perf annotate -s function_name # Same, naming the symbol explicitly (-s is --symbol)
Requires: Binary compiled with -g (debug info) for source lines; without it only assembly is shown
strace - Syscall Tracing
# Summary of syscalls
strace -c ./program
# Trace specific syscalls
strace -e read,write ./program
# With timing per call
strace -T ./program
# Follow forks
strace -f ./program
# Output to file
strace -o trace.log ./program
Key columns in -c output:
- % time: Percentage of total syscall time
- calls: Number of times called
- errors: Failed calls
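The errors column is only filled in for syscalls that failed at least once, which makes those rows easy to pick out. A rough sketch, assuming the usual six-column layout of the -c summary (% time, seconds, usecs/call, calls, errors, syscall); failed_syscalls is a hypothetical helper name, and the column layout may differ between strace versions:

```shell
# failed_syscalls
# Reads a `strace -c` summary on stdin and prints each syscall that has
# a value in the errors column, as "name: N failed of M".
# Assumes rows with errors have six whitespace-separated fields.
failed_syscalls() {
    awk 'NF == 6 && $1 ~ /^[0-9.]+$/ && $NF != "total" {
        print $6 ": " $5 " failed of " $4
    }'
}

# Example (summary saved with -o, then filtered):
# strace -c -o summary.txt ./program
# failed_syscalls < summary.txt
```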
Flamegraphs
# Clone FlameGraph repo (one time)
git clone https://github.com/brendangregg/FlameGraph.git
# Generate flamegraph
perf record -g ./program
perf script | ./FlameGraph/stackcollapse-perf.pl | ./FlameGraph/flamegraph.pl > profile.svg
# Open in browser
firefox profile.svg
Python Profiling
# cProfile - built-in profiler
python3 -m cProfile script.py
python3 -m cProfile -s cumtime script.py # Sort by cumulative time
python3 -m cProfile -s ncalls script.py # Sort by call count
python3 -m cProfile -o profile.stats script.py # Save for later analysis
# py-spy - sampling profiler (low overhead)
pip install py-spy
py-spy record -o profile.svg -- python3 script.py # Flamegraph
py-spy top -- python3 script.py # Live view
# Attach to running process
py-spy top --pid 12345
/proc Filesystem
# Process info
cat /proc/<pid>/status # Process status
cat /proc/<pid>/maps # Memory mappings
ls -l /proc/<pid>/fd # Open file descriptors (fd is a directory of symlinks)
cat /proc/<pid>/smaps # Detailed memory info
# System info
cat /proc/cpuinfo # CPU details
cat /proc/meminfo # Memory details
cat /proc/loadavg # Load average
htop - System Overview
htop # Interactive process viewer
Key shortcuts:
- F6: Sort by column
- F5: Tree view
- F9: Kill process
- t: Toggle tree view
- H: Toggle user threads
hyperfine - Benchmarking
# Basic benchmark
hyperfine './program'
# Compare two programs
hyperfine './program_v1' './program_v2'
# With warmup
hyperfine --warmup 3 './program'
# Export results
hyperfine --export-markdown results.md './program'
Quick Diagnosis Flowchart
          Program is slow
                 │
                 ▼
        ┌──────────────────┐
        │  time ./program  │
        └────────┬─────────┘
                 │
                 ▼
┌─────────────────────────────────┐
│       Is 'sys' time high?       │
└─────────┬───────────┬───────────┘
         YES          NO
          │           │
          ▼           ▼
  ┌───────────┐ ┌──────────────┐
  │ strace -c │ │ perf record  │
  │ (syscalls)│ │ (CPU profile)│
  └───────────┘ └──────────────┘
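The flowchart can be sketched as a small helper. The chart does not define what counts as "high", so the threshold used here (sys greater than user) is an assumption chosen for illustration, and next_tool is a hypothetical name:

```shell
# next_tool USER SYS
# Suggests the next tool from the flowchart.
# ASSUMPTION: 'sys' counts as high when it exceeds 'user'.
next_tool() {
    LC_ALL=C awk -v u="$1" -v s="$2" 'BEGIN {
        if (s > u) print "strace -c"
        else       print "perf record"
    }'
}

# Example: user 0.2s, sys 1.5s
# next_tool 0.2 1.5
```

Adjust the comparison to whatever ratio makes sense for the workload.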
Permission Issues
# Allow perf for non-root (temporary)
sudo sysctl -w kernel.perf_event_paranoid=1
# Allow perf for non-root (permanent)
echo 'kernel.perf_event_paranoid=1' | sudo tee -a /etc/sysctl.conf
# Run perf as root if needed
sudo perf record ./program
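Before changing the setting it can help to check the current level. The sketch below maps the standard levels to a short description, based on my reading of the kernel documentation; describe_paranoid is a hypothetical helper name, and some distributions add levels above 2 that are not covered here.

```shell
# describe_paranoid [LEVEL]
# Describes a kernel.perf_event_paranoid level for unprivileged users.
# With no argument, reads the current value from /proc.
describe_paranoid() {
    level="${1:-$(cat /proc/sys/kernel/perf_event_paranoid)}"
    case "$level" in
        -1) echo "no restrictions" ;;
        0)  echo "CPU-wide events allowed; raw tracepoints restricted" ;;
        1)  echo "own processes only, kernel and user profiling" ;;
        2)  echo "own processes only, user-space profiling only" ;;
        *)  echo "distribution-specific or unknown level: $level" ;;
    esac
}

# Example:
# describe_paranoid      # current system setting
# describe_paranoid 1    # the value set by the commands above
```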
Useful One-Liners
# Top functions by CPU time (first 20 lines of the report, header included)
perf report -n --stdio | head -20
# Count syscalls by type
strace -c ./program 2>&1 | tail -20
# Watch cache misses in real-time
perf stat -e cache-misses -I 1000 -p <pid>
# Find memory-mapped files for a process
awk '$6 ~ /^\// {print $6}' /proc/<pid>/maps | sort -u
# Monitor a Python process
py-spy top --pid $(pgrep -o -f 'python.*myapp') # -o picks the oldest match if several exist
Resources
- Brendan Gregg's site: https://www.brendangregg.com/linuxperf.html
- perf wiki: https://perf.wiki.kernel.org/
- FlameGraph repo: https://github.com/brendangregg/FlameGraph
- py-spy docs: https://github.com/benfred/py-spy