Linux Performance Engineering Workshop
4-Hour Hands-On Training for BITS Pilani Goa
Prerequisites
- Basic C programming knowledge
- Basic Python knowledge
- Familiarity with command line
- Ubuntu 22.04/24.04 (or similar Linux)
Workshop Overview
This workshop teaches practical performance engineering skills using libre tools on Linux. By the end, you'll be able to identify and fix common performance problems.
What You'll Learn
- How to measure program performance (not guess!)
- CPU profiling with perf and flamegraphs
- Identifying syscall overhead with strace
- Understanding cache behavior
- Continuous profiling for production systems
Philosophy
"Measure, don't guess."
Most performance intuitions are wrong. This workshop teaches you to find bottlenecks with data.
Schedule
| Time | Topic | Hands-On |
|---|---|---|
| 0:00-0:45 | Introduction & Theory | - |
| 0:45-1:30 | Python Profiling | Scenarios 1 & 2 |
| 1:30-1:45 | Break | - |
| 1:45-2:30 | perf & Flamegraphs | Theory + Demo |
| 2:30-3:30 | Cache & Debug Symbols | Scenarios 4 & 5 |
| 3:30-4:00 | Lunch Break | - |
| 4:00-4:30 | Syscalls & I/O | Theory |
| 4:30-5:15 | Syscall Profiling | Scenario 3 |
| 5:15-5:30 | Break | - |
| 5:30-6:00 | Advanced Topics & Wrap-up | Scenarios 6 & 7 |
Setup Instructions
Install Required Packages
# Core tools
sudo apt update
sudo apt install -y \
    build-essential \
    linux-tools-common \
    linux-tools-$(uname -r) \
    strace \
    ltrace \
    htop \
    python3-pip
# Optional but recommended
sudo apt install -y \
    hyperfine \
    valgrind \
    systemtap-sdt-dev
# Python tools
pip3 install py-spy
# Pyroscope (for scenario 7)
# Option A: Docker
docker pull grafana/pyroscope
# Option B: Download binary from https://github.com/grafana/pyroscope/releases
# FlameGraph scripts
cd ~
git clone https://github.com/brendangregg/FlameGraph.git
Configure perf Permissions
# Allow perf for non-root users (needed for this workshop)
sudo sysctl -w kernel.perf_event_paranoid=1
# To make permanent:
echo 'kernel.perf_event_paranoid=1' | sudo tee -a /etc/sysctl.conf
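To confirm the setting took effect, you can read it back from /proc. The sketch below is a minimal check, assuming a Linux system where /proc/sys/kernel/perf_event_paranoid is readable; the function names and the threshold of 1 (taken from the command above) are illustrative choices, not part of the workshop scripts.

```python
from pathlib import Path

# Kernel setting that the sysctl command above changes.
PARANOID_PATH = Path("/proc/sys/kernel/perf_event_paranoid")


def read_paranoid_level(path=PARANOID_PATH):
    """Return the current setting as an int, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def describe(level, wanted=1):
    """Turn the numeric level into a short status message.

    Lower values are less restrictive, so anything <= wanted is fine here.
    """
    if level is None:
        return "could not read perf_event_paranoid (not Linux, or /proc unavailable)"
    if level <= wanted:
        return f"perf_event_paranoid={level}: OK for this workshop"
    return f"perf_event_paranoid={level}: too restrictive, rerun the sysctl command"


if __name__ == "__main__":
    print(describe(read_paranoid_level()))
```

Some distributions ship a higher default than 1, so it is worth running this once after a reboot if you made the change permanent.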
Verify Installation
# Should all work without errors:
perf --version
strace --version
py-spy --version
gcc --version
python3 --version
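If you prefer a single check over running each command by hand, the following sketch looks for the same five tools on your PATH. It only assumes Python's standard library; the function name missing_tools is hypothetical.

```python
import shutil

# The same tools the version checks above exercise.
REQUIRED_TOOLS = ["perf", "strace", "py-spy", "gcc", "python3"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from the list that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All workshop tools found on PATH")
```

Finding a tool on PATH does not prove it runs. For example, perf can be present but built for a different kernel, so the version commands above remain the real test.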
Directory Structure
perf-workshop/
├── README.md # This file
├── common/
│ └── CHEATSHEET.md # Quick reference card
├── scenario1-python-to-c/
│ ├── README.md
│ ├── prime_slow.py # Slow Python version
│ ├── prime.c # C implementation
│ └── prime_fast.py # Python + C via ctypes
├── scenario2-memoization/
│ ├── README.md
│ ├── fib_slow.py # Naive recursive Fibonacci
│ ├── fib_cached.py # Memoized Fibonacci
│ └── config_validator.py # Precomputation example
├── scenario3-syscall-storm/
│ ├── README.md
│ ├── Makefile
│ ├── read_slow.c # Byte-by-byte reads
│ ├── read_fast.c # Buffered reads
│ ├── read_stdio.c # stdio buffering
│ └── read_python.py # Python equivalent
├── scenario4-cache-misses/
│ ├── README.md
│ ├── Makefile
│ ├── cache_demo.c # Row vs column major
│ └── list_vs_array.c # Array vs linked list
├── scenario5-debug-symbols/
│ ├── README.md
│ ├── Makefile
│ └── program.c # Multi-function program
├── scenario6-usdt-probes/
│ ├── README.md
│ ├── Makefile
│ └── server.c # Program with USDT probes
└── scenario7-pyroscope/
  ├── README.md
  ├── requirements.txt
  ├── app.py # Flask app with Pyroscope
  └── loadgen.sh # Load generator script
Quick Start
Build Everything
# Build all C programs
for dir in scenario{3,4,5,6}*/; do
    if [ -f "$dir/Makefile" ]; then
        echo "Building $dir"
        make -C "$dir"
    fi
done
# Build scenario 1 C library
cd scenario1-python-to-c
gcc -O2 -fPIC -shared -o libprime.so prime.c
cd ..
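Once libprime.so is built, Python can call into it through ctypes. The sketch below shows the general shape of such a wrapper. It assumes the library exports a function `int is_prime(long n)`; that name and signature are a guess for illustration, so check prime.c and prime_fast.py for the real interface.

```python
import ctypes
from pathlib import Path


def load_is_prime(lib_path="./libprime.so"):
    """Return a Python callable wrapping the C function, or None if the
    shared library has not been built yet.

    ASSUMPTION: the library exports `int is_prime(long n)`. The real
    prime.c may use a different name or signature.
    """
    path = Path(lib_path)
    if not path.exists():
        return None
    lib = ctypes.CDLL(str(path.resolve()))
    # Declaring argument and return types lets ctypes convert values correctly.
    lib.is_prime.argtypes = [ctypes.c_long]
    lib.is_prime.restype = ctypes.c_int
    return lambda n: bool(lib.is_prime(n))


if __name__ == "__main__":
    is_prime = load_is_prime()
    if is_prime is None:
        print("libprime.so not found; build it with the gcc command first")
    else:
        print("97 is prime:", is_prime(97))
```

If the assumed symbol does not exist in the library, the attribute lookup raises AttributeError, which is a quick way to find out that the name is wrong.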
Run a Scenario
Each scenario has its own README with step-by-step instructions. Start with:
cd scenario1-python-to-c
cat README.md
Key Concepts Summary
1. Types of Bottlenecks
| Type | Symptom | Tool |
|---|---|---|
| CPU-bound | user time is high | perf record |
| Syscall-bound | sys time is high | strace -c |
| I/O-bound | Low CPU, slow wall time | strace, iostat |
| Memory-bound | High cache misses | perf stat |
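The first three rows of the table can be turned into a rough first-pass check based on user, sys, and wall time. The sketch below is one way to do that from Python on Linux, using the standard resource module. The classify function and its 0.5 threshold are arbitrary illustrations, not a rule from the workshop, and memory-bound programs cannot be detected this way because cache misses show up as user time.

```python
import resource
import subprocess
import sys
import time


def measure(cmd):
    """Run cmd and return (wall, user, sys) seconds for the child process."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    user = after.ru_utime - before.ru_utime
    sys_time = after.ru_stime - before.ru_stime
    return wall, user, sys_time


def classify(wall, user, sys_time):
    """Rough first guess only; the 0.5 threshold is arbitrary."""
    cpu = user + sys_time
    if wall > 0 and cpu < 0.5 * wall:
        return "likely I/O-bound or waiting"
    if sys_time > user:
        return "likely syscall-bound"
    return "likely CPU-bound"


if __name__ == "__main__":
    # Example workload: a short pure-Python computation.
    wall, user, sys_time = measure([sys.executable, "-c", "sum(range(10**7))"])
    print(f"wall={wall:.3f}s user={user:.3f}s sys={sys_time:.3f}s")
    print(classify(wall, user, sys_time))
```

Treat the result as a hypothesis to confirm with the tool in the right-hand column. Multi-threaded programs can also report more CPU time than wall time, which this sketch does not account for.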
2. Profiling Workflow
1. Measure: time ./program
2. Hypothesize: Where is time spent?
3. Profile: perf/strace/cProfile
4. Analyze: Find hot spots
5. Optimize: Fix the bottleneck
6. Verify: Re-measure
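The six steps above can be sketched end to end in Python with cProfile. Everything in this example (the workload, the function names, the input size) is made up for illustration and is not taken from the workshop scenarios.

```python
import cProfile
import io
import pstats
import time


def square(x):
    return x * x


def slow_sum_of_squares(n):
    """Deliberately naive workload: one Python function call per element."""
    total = 0
    for i in range(n):
        total += square(i)
    return total


def fast_sum_of_squares(n):
    """Candidate fix: same result without the per-element function call."""
    return sum(i * i for i in range(n))


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def profile_top(fn, *args, limit=5):
    """Profile one call and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(fn, *args)
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(limit)
    return out.getvalue()


if __name__ == "__main__":
    n = 200_000
    expected, before = timed(slow_sum_of_squares, n)   # 1. Measure
    print(profile_top(slow_sum_of_squares, n))         # 3-4. Profile and analyze
    result, after = timed(fast_sum_of_squares, n)      # 5. Optimize
    assert result == expected                          # the fix must stay correct
    print(f"before={before:.4f}s after={after:.4f}s")  # 6. Verify
```

The final print is the important part: compare the two timings yourself rather than assuming the rewrite helped.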
3. Tool Selection
| Task | Tool |
|---|---|
| Basic timing | time |
| CPU sampling | perf record |
| Hardware counters | perf stat |
| Syscall tracing | strace -c |
| Python profiling | cProfile, py-spy |
| Visualization | Flamegraphs |
| Continuous profiling | Pyroscope |
Further Learning
Books
- "Systems Performance" by Brendan Gregg
- "BPF Performance Tools" by Brendan Gregg
Online Resources
- https://www.brendangregg.com/linuxperf.html
- https://perf.wiki.kernel.org/
- https://jvns.ca/blog/2016/03/12/how-does-perf-work-and-some-questions/
Tools to Explore Later
- bpftrace: High-level tracing language
- eBPF: In-kernel programmability
- Valgrind: Memory profiling
- gprof: Traditional profiler
Troubleshooting
"perf: command not found"
sudo apt install linux-tools-common linux-tools-$(uname -r)
"Access to performance monitoring operations is limited"
sudo sysctl -w kernel.perf_event_paranoid=1
"py-spy: Permission denied"
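Either run as root, or let py-spy launch the script itself, which usually works without root. The --nonblocking flag stops py-spy from pausing the target while sampling; it does not change the permissions py-spy needs: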
sudo py-spy record -o profile.svg -- python3 script.py
# Or:
py-spy record --nonblocking -o profile.svg -- python3 script.py
"No debug symbols"
Recompile with -g:
gcc -O2 -g -o program program.c
Feedback
Found an issue? Have suggestions? Please provide feedback to your instructor!
Workshop materials prepared for BITS Pilani Goa
Tools: All libre/open-source software