perf-workshop/README.md
2026-01-10 13:05:09 +05:30

# Linux Performance Engineering Workshop
## 4-Hour Hands-On Training for BITS Pilani Goa
### Prerequisites
- Basic C programming knowledge
- Basic Python knowledge
- Familiarity with command line
- Ubuntu 22.04/24.04 (or similar Linux)
---
## Workshop Overview
This workshop teaches practical performance engineering skills using libre tools on Linux.
By the end, you'll be able to identify and fix common performance problems.
### What You'll Learn
- How to measure program performance (not guess!)
- CPU profiling with `perf` and flamegraphs
- Identifying syscall overhead with `strace`
- Understanding cache behavior
- Continuous profiling for production systems
### Philosophy
> "Measure, don't guess."
Intuition about where a program spends its time is often wrong. This workshop teaches you to find bottlenecks with data instead.
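
As a small illustration of measuring instead of guessing, the sketch below times two ways of building the same string. The helper `measure` and the two candidate functions are made up for this example and are not part of the workshop scenarios; which candidate wins depends on your interpreter and input size, which is the point of measuring.

```python
import statistics
import time


def measure(func, repeats=5):
    """Return the median wall-clock time (seconds) of func() over several runs."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def concat_loop(n=20_000):
    """Build a string by repeated += in a loop."""
    out = ""
    for i in range(n):
        out += str(i)
    return out


def concat_join(n=20_000):
    """Build the same string with a single join."""
    return "".join(str(i) for i in range(n))


if __name__ == "__main__":
    for candidate in (concat_loop, concat_join):
        print(f"{candidate.__name__}: {measure(candidate) * 1e3:.2f} ms")
```

Taking the median of several runs reduces the influence of one-off noise such as a cold cache or a background process.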
---
## Schedule
| Time | Topic | Hands-On |
|------|-------|----------|
| 0:00-0:45 | Introduction & Theory | - |
| 0:45-1:30 | Python Profiling | Scenarios 1 & 2 |
| 1:30-1:45 | Break | - |
| 1:45-2:30 | perf & Flamegraphs | Theory + Demo |
| 2:30-3:30 | Cache & Debug Symbols | Scenarios 4 & 5 |
| 3:30-4:00 | Lunch Break | - |
| 4:00-4:30 | Syscalls & I/O | Theory |
| 4:30-5:15 | Syscall Profiling | Scenario 3 |
| 5:15-5:30 | Break | - |
| 5:30-6:00 | Advanced Topics & Wrap-up | Scenarios 6 & 7 |
---
## Setup Instructions
### Install Required Packages
```bash
# Core tools
sudo apt update
sudo apt install -y \
    build-essential \
    linux-tools-common \
    linux-tools-$(uname -r) \
    strace \
    ltrace \
    htop \
    python3-pip
# Optional but recommended
sudo apt install -y \
    hyperfine \
    valgrind \
    systemtap-sdt-dev
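# Python tools
# Note: on Ubuntu 24.04, pip refuses to install into the system Python
# ("externally-managed-environment"); install inside a virtual environment there.
pip3 install py-spy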
# Pyroscope (for scenario 7)
# Option A: Docker
docker pull grafana/pyroscope
# Option B: Download binary from https://github.com/grafana/pyroscope/releases
# FlameGraph scripts
cd ~
git clone https://github.com/brendangregg/FlameGraph.git
```
### Configure perf Permissions
```bash
# Allow perf for non-root users (needed for this workshop)
sudo sysctl -w kernel.perf_event_paranoid=1
# To make permanent:
echo 'kernel.perf_event_paranoid=1' | sudo tee -a /etc/sysctl.conf
```
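
To see what the current setting allows before changing it, a short script can read the sysctl value from `/proc`. This is a sketch: the function names are made up for this example, and the descriptions paraphrase the kernel documentation. Levels 3 and above come from a Debian/Ubuntu kernel patch and are treated here as an assumption about your distribution.

```python
from pathlib import Path

PARANOID_PATH = Path("/proc/sys/kernel/perf_event_paranoid")


def describe_paranoid(level):
    """Summarize what unprivileged users may do at a given paranoid level."""
    if level >= 3:
        # Assumption: distribution-specific level (Debian/Ubuntu kernels).
        return "perf is disabled for unprivileged users"
    if level == 2:
        return "own processes only, user space only"
    if level == 1:
        return "own processes, including kernel-side samples"
    if level == 0:
        return "system-wide CPU events allowed, raw tracepoints restricted"
    return "almost all events allowed for all users"


def read_paranoid(path=PARANOID_PATH):
    """Return the current level, or None if the file is not available."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    level = read_paranoid()
    if level is None:
        print("perf_event_paranoid not readable on this system")
    else:
        print(f"perf_event_paranoid={level}: {describe_paranoid(level)}")
```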
### Verify Installation
```bash
# Should all work without errors:
perf --version
strace --version
py-spy --version
gcc --version
python3 --version
```
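
If you are preparing many machines, the same check can be scripted. The sketch below only tests whether each tool is on `PATH`; it does not confirm that the tool runs. The list and function name are chosen for this example.

```python
import shutil

# Tools checked by hand in the block above.
WORKSHOP_TOOLS = ["perf", "strace", "py-spy", "gcc", "python3"]


def missing_tools(tools=WORKSHOP_TOOLS):
    """Return the subset of tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All workshop tools found on PATH")
```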
---
## Nix-Based Setup (Alternative)
This workshop includes a Nix flake for reproducible environments. Use this if you have Nix installed or want a pre-configured bootable image.
### Quick Reference
| Goal | Command |
|------|---------|
| Dev shell with all tools | `nix develop` |
| Apply tools to Ubuntu | `nix run github:numtide/system-manager -- switch --flake .` |
| Build bootable USB ISO | `nix build .#iso` |
| Build netboot files | `nix build .#netboot` |
### Development Shell
Get all workshop tools in your current shell without installing anything system-wide:
```bash
cd perf-workshop
nix develop
# Now you have: perf, strace, py-spy, bpftrace, hyperfine, valgrind, flamegraph, pyroscope
perf --version
py-spy --help
```
### System-Manager (Ubuntu/Debian)
Install all workshop tools on an existing Ubuntu system using Nix:
```bash
# Apply the configuration (installs tools via Nix)
nix run 'github:numtide/system-manager' -- switch --flake .
# Configure perf permissions (system-manager can't do this)
sudo /etc/perf-workshop-setup.sh
```
This installs tools into `/nix/store` and adds them to your PATH without conflicting with apt packages.
### Bootable USB Image
Build a complete NixOS image with XFCE desktop, all tools pre-installed, and workshop materials:
```bash
# Build the ISO (~4-5 GB)
nix build .#iso
# Flash to USB (replace sdX with your device)
sudo dd if=result/iso/*.iso of=/dev/sdX bs=4M status=progress conv=fsync
```
**ISO Features:**
- XFCE desktop with auto-login (user: `workshop`, password: `workshop`)
- `copytoram` enabled — boots from USB, runs entirely from RAM (USB can be removed after boot)
- `kernel.perf_event_paranoid=1` pre-configured
- Workshop materials in `/home/workshop/perf-workshop`
- Desktop shortcut to open terminal in workshop directory
- SSH enabled for remote access
**Requirements:** 8+ GB RAM recommended (the system runs from RAM)
### Netboot over LAN
For workshops with many participants, netboot is more efficient than flashing multiple USBs.
```bash
# Build netboot bundle
nix build .#netboot
cd result
# Contents:
# - bzImage (kernel)
# - initrd (initrd with full system, ~2-4 GB)
# - netboot.ipxe (iPXE boot script)
```
**Option 1: Pixiecore (easiest)**
Pixiecore is an all-in-one PXE server — just point it at the files:
```bash
nix shell nixpkgs#pixiecore
# Serve on your LAN (requires root for DHCP proxy)
sudo pixiecore boot bzImage initrd \
    --cmdline "$(grep -oP 'imgargs.*? \K.*' netboot.ipxe)"
```
Participants set their BIOS to network boot and get the workshop environment automatically.
**Option 2: dnsmasq + HTTP server**
For more control or integration with existing infrastructure:
```bash
# Terminal 1: Serve files over HTTP
python3 -m http.server 8080
```
Configure dnsmasq (`/etc/dnsmasq.d/workshop.conf`):
```ini
interface=eth0
dhcp-range=192.168.1.100,192.168.1.200,12h
enable-tftp
tftp-root=/path/to/result
dhcp-boot=netboot.ipxe
```
**Option 3: Existing PXE infrastructure**
Copy files to your TFTP/HTTP server and configure your DHCP server to serve `netboot.ipxe`.
### Flake Outputs Reference
```bash
# List all outputs
nix flake show
# Available outputs:
# - devShells.x86_64-linux.default # Development shell
# - packages.x86_64-linux.iso # Bootable ISO image
# - packages.x86_64-linux.netboot # Netboot bundle (kernel + initrd + ipxe)
# - packages.x86_64-linux.netboot-kernel
# - packages.x86_64-linux.netboot-initrd
# - packages.x86_64-linux.netboot-ipxe
# - nixosConfigurations.workshop-iso # NixOS config for ISO
# - nixosConfigurations.workshop-netboot # NixOS config for netboot
# - systemConfigs.default # system-manager config for Ubuntu
```
---
## Directory Structure
```
perf-workshop/
├── README.md                 # This file
├── flake.nix                 # Nix flake (dev shell, ISO, netboot)
├── flake.lock                # Locked dependencies
├── nix/
│   ├── packages.nix          # Shared package list
│   ├── common.nix            # Common NixOS configuration
│   ├── iso.nix               # ISO-specific configuration
│   ├── netboot.nix           # Netboot-specific configuration
│   └── system-manager.nix    # Ubuntu system-manager module
├── common/
│   └── CHEATSHEET.md         # Quick reference card
├── scenario1-python-to-c/
│   ├── README.md
│   ├── prime_slow.py         # Slow Python version
│   ├── prime.c               # C implementation
│   └── prime_fast.py         # Python + C via ctypes
├── scenario2-memoization/
│   ├── README.md
│   ├── fib_slow.py           # Naive recursive Fibonacci
│   ├── fib_cached.py         # Memoized Fibonacci
│   └── config_validator.py   # Precomputation example
├── scenario3-syscall-storm/
│   ├── README.md
│   ├── Makefile
│   ├── read_slow.c           # Byte-by-byte reads
│   ├── read_fast.c           # Buffered reads
│   ├── read_stdio.c          # stdio buffering
│   └── read_python.py        # Python equivalent
├── scenario4-cache-misses/
│   ├── README.md
│   ├── Makefile
│   ├── cache_demo.c          # Row vs column major
│   └── list_vs_array.c       # Array vs linked list
├── scenario5-debug-symbols/
│   ├── README.md
│   ├── Makefile
│   └── program.c             # Multi-function program
├── scenario6-usdt-probes/
│   ├── README.md
│   ├── Makefile
│   └── server.c              # Program with USDT probes
└── scenario7-pyroscope/
    ├── README.md
    ├── requirements.txt
    ├── app.py                # Flask app with Pyroscope
    └── loadgen.sh            # Load generator script
```
---
## Quick Start
### Build Everything
```bash
# Build all C programs
for dir in scenario{3,4,5,6}*/; do
    if [ -f "$dir/Makefile" ]; then
        echo "Building $dir"
        make -C "$dir"
    fi
done
# Build scenario 1 C library
cd scenario1-python-to-c
gcc -O2 -fPIC -shared -o libprime.so prime.c
cd ..
```
### Run a Scenario
Each scenario has its own README with step-by-step instructions.
Start with:
```bash
cd scenario1-python-to-c
cat README.md
```
---
## Key Concepts Summary
### 1. Types of Bottlenecks
| Type | Symptom | Tool |
|------|---------|------|
| CPU-bound | `user` time is high | `perf record` |
| Syscall-bound | `sys` time is high | `strace -c` |
| I/O-bound | Low CPU, slow wall time | `strace`, `iostat` |
| Memory-bound | High cache misses | `perf stat` |
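
The first three rows can be approximated from the same numbers `time` prints: wall, user, and sys. The sketch below gathers them for a Python callable and applies a rough classification. The 50% threshold and the function names are assumptions for illustration, not rules from the workshop. Memory-bound code cannot be detected this way and needs hardware counters.

```python
import os
import time


def cpu_breakdown(func):
    """Run func() and return (wall, user, sys) seconds, including child processes."""
    t0 = os.times()
    w0 = time.perf_counter()
    func()
    wall = time.perf_counter() - w0
    t1 = os.times()
    user = (t1.user - t0.user) + (t1.children_user - t0.children_user)
    system = (t1.system - t0.system) + (t1.children_system - t0.children_system)
    return wall, user, system


def classify(wall, user, system):
    """Rough guess at the bottleneck type; thresholds are illustrative only."""
    if wall > 0 and (user + system) < 0.5 * wall:
        return "I/O-bound (or waiting)"
    if system > user:
        return "syscall-bound"
    return "CPU-bound"


if __name__ == "__main__":
    wall, user, system = cpu_breakdown(lambda: sum(i * i for i in range(2_000_000)))
    print(f"wall={wall:.3f}s user={user:.3f}s sys={system:.3f}s")
    print("Looks", classify(wall, user, system))
```

Treat the label as a starting hypothesis to confirm with the tool in the table's last column.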
### 2. Profiling Workflow
```
1. Measure: time ./program
2. Hypothesize: Where is time spent?
3. Profile: perf/strace/cProfile
4. Analyze: Find hot spots
5. Optimize: Fix the bottleneck
6. Verify: Re-measure
```
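
Steps 3 and 4 of the workflow can be sketched in Python with the standard-library `cProfile` and `pstats` modules. The functions `workload`, `parse_config`, and `hottest_function` are hypothetical and exist only for this example; the naive Fibonacci mirrors the idea behind scenario 2, not its actual code.

```python
import cProfile
import pstats


def fib(n):
    """Naive recursive Fibonacci: deliberately slow."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def parse_config():
    """Stand-in for cheap setup work."""
    return {"depth": 18}


def workload():
    cfg = parse_config()
    return fib(cfg["depth"])


def hottest_function(func):
    """Profile func() and return the name of the function with the most self time."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    stats = pstats.Stats(profiler)
    # stats.stats maps (file, line, name) -> (cc, nc, tottime, cumtime, callers)
    (_, _, name), _ = max(stats.stats.items(), key=lambda item: item[1][2])
    return name


if __name__ == "__main__":
    print("Hot spot:", hottest_function(workload))
```

Sorting by self time (`tottime`) rather than cumulative time points at the function doing the work rather than at its callers.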
### 3. Tool Selection
| Task | Tool |
|------|------|
| Basic timing | `time` |
| CPU sampling | `perf record` |
| Hardware counters | `perf stat` |
| Syscall tracing | `strace -c` |
| Python profiling | `cProfile`, `py-spy` |
| Visualization | Flamegraphs |
| Continuous profiling | Pyroscope |
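
The `strace -c` row is easiest to appreciate with a program that makes far too many syscalls. The sketch below reads a file with `os.read`, which issues one `read` syscall per call, and counts the calls for different chunk sizes. It is in the spirit of scenario 3 but is not the scenario's code; the function name and file size are made up for this example.

```python
import os
import tempfile


def count_reads(path, chunk_size):
    """Read the whole file with os.read() and return (bytes_read, read_calls)."""
    fd = os.open(path, os.O_RDONLY)
    total = calls = 0
    try:
        while True:
            chunk = os.read(fd, chunk_size)
            calls += 1  # counts the final empty read that signals EOF
            if not chunk:
                break
            total += len(chunk)
    finally:
        os.close(fd)
    return total, calls


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 65536)
    try:
        for size in (1, 4096):
            total, calls = count_reads(tmp.name, size)
            print(f"chunk={size}: {total} bytes in {calls} read() calls")
    finally:
        os.unlink(tmp.name)
```

Running the script under `strace -c` should show the `read` count in the summary dominated by the one-byte loop.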
---
## Further Learning
### Books
- "Systems Performance" by Brendan Gregg
- "BPF Performance Tools" by Brendan Gregg
### Online Resources
- https://www.brendangregg.com/linuxperf.html
- https://perf.wiki.kernel.org/
- https://jvns.ca/blog/2016/03/12/how-does-perf-work-and-some-questions/
### Tools to Explore Later
- `bpftrace` - High-level tracing language
- `eBPF` - In-kernel programmability
- `Valgrind` - Memory profiling
- `gprof` - Traditional profiler
---
## Troubleshooting
### "perf: command not found"
```bash
sudo apt install linux-tools-common linux-tools-$(uname -r)
```
### "Access to performance monitoring operations is limited"
```bash
sudo sysctl -w kernel.perf_event_paranoid=1
```
### "py-spy: Permission denied"
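Attaching to an already-running process usually requires root because of the kernel's ptrace restrictions (`kernel.yama.ptrace_scope`). Letting py-spy launch the script as its own child process usually works without root:
```bash
# Attaching to a running process (replace <PID>) usually needs root:
sudo py-spy record -o profile.svg --pid <PID>
# Launching the script as a child of py-spy usually does not:
py-spy record -o profile.svg -- python3 script.py
```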
### "No debug symbols"
Recompile with `-g`:
```bash
gcc -O2 -g -o program program.c
```
---
## Feedback
Found an issue? Have suggestions?
Please provide feedback to your instructor!
---
*Workshop materials prepared for BITS Pilani Goa*
*Tools: All libre/open-source software*