illustris 4fb1bd90db
init
2026-01-08 18:11:30 +05:30


# Scenario 6: USDT Probes - Custom Instrumentation
## Learning Objectives
- Understand the difference between static and dynamic probes
- Add USDT probes to C code
- Trace probes with bpftrace
- See how probes enable production debugging
## Background
**Dynamic probes** (like `perf probe`): Added at runtime; can attach to almost any function without source changes
**Static probes** (USDT, User Statically-Defined Tracing): Compiled into the binary at points chosen by the developer
USDT probes are:
- Nearly zero overhead when not traced
- Stable API for debuggers/tracers
- Self-documenting: probe names describe what's happening
- Used by Python, Ruby, MySQL, PostgreSQL, and many other projects
## Prerequisites
```bash
# Install USDT support and bpftrace
sudo apt install systemtap-sdt-dev bpftrace
```
## Files
- `server.c` - Simulated server with USDT probes at key points
## Exercise 1: Build and Run
```bash
make
./server 3 10
```
This processes 3 batches of 10 requests each.
## Exercise 2: List the Probes
```bash
# Using readelf
readelf -n ./server | grep -A2 stapsdt
# Or using perf (register the binary's SDT notes in the build-id cache first)
sudo perf buildid-cache --add ./server
sudo perf list sdt
```
You should see probes like:
- `myserver:server_start`
- `myserver:request_start`
- `myserver:request_end`
- `myserver:request_error`
## Exercise 3: Trace with bpftrace
### Count requests by type
```bash
# In terminal 1:
sudo bpftrace -e '
usdt:./server:myserver:request_start {
@types[arg1] = count();
}
END {
print(@types);
}
'
# In terminal 2 (when it finishes, press Ctrl-C in terminal 1 to print the counts):
./server 5 50
```
### Measure request latency
```bash
sudo bpftrace -e '
usdt:./server:myserver:request_start {
@start[arg0] = nsecs;
}
// Skip requests whose start was not seen (tracing began mid-request)
usdt:./server:myserver:request_end /@start[arg0]/ {
$latency = (nsecs - @start[arg0]) / 1000000;
@latency_ms = hist($latency);
delete(@start[arg0]);
}
END {
print(@latency_ms);
}
'
```
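The latency script keys its `@start` map on `arg0`, so it only works if the server passes the same request id to both `request_start` and `request_end`, and fires `request_end` on every exit path. The following is a minimal sketch of such a handler, not the actual `server.c`: the `struct request` layout, the `handle_request` function, the error code, and the argument list of `request_end` are all assumptions made for illustration. The no-op fallback macros are likewise an assumption of this sketch so that it compiles without `systemtap-sdt-dev`.

```c
#include <assert.h>
#include <stddef.h>

/* Use the real macros when <sys/sdt.h> is available; otherwise fall back to
 * no-ops. The fallback is an assumption of this sketch, not part of sdt.h. */
#if defined(__linux__) && defined(__has_include)
#  if __has_include(<sys/sdt.h>)
#    include <sys/sdt.h>
#    define HAVE_SDT 1
#  endif
#endif
#ifndef HAVE_SDT
#  define DTRACE_PROBE1(provider, name, a)    ((void)(a))
#  define DTRACE_PROBE2(provider, name, a, b) ((void)(a), (void)(b))
#endif

/* Hypothetical request layout; the real server.c may differ. */
struct request {
    long id;    /* unique among in-flight requests */
    int  type;
};

/* Hypothetical handler: returns 0 on success or a nonzero error code. */
int handle_request(const struct request *req)
{
    int err = 0;

    assert(req != NULL);

    /* arg0 = id, arg1 = type: matches what the bpftrace scripts read. */
    DTRACE_PROBE2(myserver, request_start, req->id, req->type);

    if (req->type < 0) {   /* stand-in for real validation or work */
        err = 22;          /* arbitrary illustrative error code */
        DTRACE_PROBE2(myserver, request_error, req->id, err);
    }

    /* Fire request_end on every path, with the same id as request_start,
     * so the tracer can compute the latency and delete its @start entry. */
    DTRACE_PROBE1(myserver, request_end, req->id);
    return err;
}
```

If `request_end` were skipped on the error path, the tracer's `@start` map would keep one stale entry per failed request.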
### Track errors
```bash
sudo bpftrace -e '
usdt:./server:myserver:request_error {
printf("ERROR: request %d failed with code %d\n", arg0, arg1);
@errors = count();
}
'
```
## Exercise 4: Trace with perf
```bash
# Register the binary's SDT notes in perf's build-id cache
sudo perf buildid-cache --add ./server
# Add probe
sudo perf probe sdt_myserver:request_start
# Record
sudo perf record -e sdt_myserver:request_start ./server 3 20
# Report
sudo perf report
```
## How USDT Probes Work
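Each `DTRACE_PROBE*` macro emits a single NOP instruction at the probe site and records the probe's provider, name, address, and argument locations in an ELF note section (`.note.stapsdt`):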
```c
DTRACE_PROBE2(myserver, request_start, req->id, req->type);
```
Compiles to something like:
```asm
nop ; placeholder for probe
```
When you activate tracing:
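1. The tracer finds the probe location (stored in the ELF notes)
2. The kernel (via uprobes) replaces the NOP with a trap instruction (INT3 on x86)
3. When execution reaches the probe, the trap fires, the tracer's handler runs, and control returns to the program
4. When tracing stops, the NOP is restored
**Overhead when not tracing**: ~0 (a single NOP, plus whatever the program does to compute the probe arguments)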
**Overhead when tracing**: trap + handler execution
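The arm and disarm steps can be pictured with a toy model. The sketch below is only a simulation on a byte array, not how the kernel's uprobes code is implemented: the `probe_site` struct and the `probe_arm` and `probe_disarm` functions are hypothetical names, and step 3 (the trap firing and the handler running) is not modeled. The byte values are the x86 encodings of `nop` (`0x90`) and `int3` (`0xCC`).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { X86_NOP = 0x90, X86_INT3 = 0xCC };

/* Toy model of one probe site inside a buffer of code bytes. */
struct probe_site {
    uint8_t *text;    /* stand-in for the program's code bytes */
    size_t   offset;  /* probe location, as read from the ELF note (step 1) */
    int      armed;   /* nonzero while the trap byte is in place */
};

/* Step 2: swap the NOP for a trap. Fails if the site is already armed or
 * does not currently hold a NOP. */
int probe_arm(struct probe_site *p)
{
    if (p->armed || p->text[p->offset] != X86_NOP)
        return -1;
    p->text[p->offset] = X86_INT3;
    p->armed = 1;
    return 0;
}

/* Step 4: restore the original NOP when tracing stops. */
int probe_disarm(struct probe_site *p)
{
    if (!p->armed)
        return -1;
    assert(p->text[p->offset] == X86_INT3);
    p->text[p->offset] = X86_NOP;
    p->armed = 0;
    return 0;
}
```

Because only one byte changes and it is restored afterwards, an untraced binary runs exactly the code the compiler emitted.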
## Real-World Uses
### Python has USDT probes:
```bash
# If Python was built with --enable-dtrace
sudo bpftrace -e 'usdt:/usr/bin/python3:python:function__entry { printf("%s\n", str(arg1)); }'
```
### MySQL probes for query tracking:
```bash
# Requires a mysqld built with DTrace probes (support was removed in MySQL 8.0)
sudo bpftrace -e 'usdt:/usr/sbin/mysqld:mysql:query__start { printf("%s\n", str(arg0)); }'
```
## Discussion Questions
1. **When would you use USDT vs dynamic probes?**
- USDT: Known important points, stable interface
- Dynamic: Ad-hoc debugging, no source changes
2. **What's the trade-off of adding probes?**
- Pro: Always available for debugging
- Con: Must plan ahead, adds code complexity
3. **Why not just use printf debugging?**
- Printf costs formatting and I/O time on every call, even when nobody reads the output
- USDT has near-zero overhead until a tracer attaches
- USDT can be enabled on a running binary without rebuilding or restarting it
## Advanced: Creating Custom Probes
The probe macros from `<sys/sdt.h>`:
```c
DTRACE_PROBE(provider, name) // No arguments
DTRACE_PROBE1(provider, name, arg1) // 1 argument
DTRACE_PROBE2(provider, name, arg1, arg2) // 2 arguments
// ... up to DTRACE_PROBE12
```
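Arguments can be integers or pointers. A string is passed as a `char *`: the probe records only the pointer, and the tracer copies the bytes itself (for example with `str(arg0)` in bpftrace).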
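As a rough illustration of the different arities and of passing a string, here is a small sketch. The `note_client` function and the `client_seen`, `client_port`, and `client_name` probe names are hypothetical and do not come from `server.c`. The no-op fallback macros are an assumption added so the sketch compiles even when `<sys/sdt.h>` is not installed.

```c
#include <assert.h>
#include <stddef.h>

/* Use the real macros when <sys/sdt.h> is available; otherwise fall back to
 * no-ops. The fallback is an assumption of this sketch, not part of sdt.h. */
#if defined(__linux__) && defined(__has_include)
#  if __has_include(<sys/sdt.h>)
#    include <sys/sdt.h>
#    define HAVE_SDT 1
#  endif
#endif
#ifndef HAVE_SDT
#  define DTRACE_PROBE(provider, name)        ((void)0)
#  define DTRACE_PROBE1(provider, name, a)    ((void)(a))
#  define DTRACE_PROBE2(provider, name, a, b) ((void)(a), (void)(b))
#endif

/* Hypothetical helper instrumented with three probes of different arity.
 * Returns 1 if the port is in the valid TCP range, 0 otherwise. */
int note_client(const char *name, long port)
{
    assert(name != NULL);

    DTRACE_PROBE(myserver, client_seen);         /* no arguments */
    DTRACE_PROBE1(myserver, client_port, port);  /* one integer */

    /* Only the pointer is recorded at the probe site. A tracer reads the
     * string contents later, e.g. str(arg0) in bpftrace. */
    DTRACE_PROBE2(myserver, client_name, name, port);

    return port > 0 && port < 65536;
}
```

With the real header, a bpftrace action such as `printf("%s:%d\n", str(arg0), arg1)` on the hypothetical `client_name` probe would print the name and port.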
## Key Takeaways
1. **USDT probes are designed-in observability**
2. **Near-zero overhead when not actively tracing**
3. **bpftrace makes probe usage easy**
4. **Many production systems already have probes (Python, databases, etc.)**
This is an advanced topic - the main takeaway for beginners is that
such instrumentation exists and enables powerful production debugging.