Performance Optimization Fundamentals
Before diving into specific techniques, let’s establish some fundamental principles:
The Optimization Process
1. Measure - Establish a baseline and identify bottlenecks
2. Analyze - Understand why the bottlenecks exist
3. Improve - Make targeted changes to address the bottlenecks
4. Verify - Measure again to confirm improvements
5. Repeat - Continue until performance goals are met
Premature Optimization
As Donald Knuth famously said, “Premature optimization is the root of all evil.” Focus on writing clear, correct code first, then optimize where necessary:
// First version: Clear and correct
fn calculate_average(numbers: &[f64]) -> f64 {
let sum: f64 = numbers.iter().sum();
sum / numbers.len() as f64
}
// Optimized version (only after profiling shows it's needed)
fn calculate_average_optimized(numbers: &[f64]) -> f64 {
let mut sum = 0.0;
let len = numbers.len();
// Manual loop unrolling for better performance
let mut i = 0;
while i + 4 <= len {
sum += numbers[i] + numbers[i + 1] + numbers[i + 2] + numbers[i + 3];
i += 4;
}
// Handle remaining elements
while i < len {
sum += numbers[i];
i += 1;
}
sum / len as f64
}
Profiling and Benchmarking
Before optimizing, you need to identify where your code is spending time:
Benchmarking with Criterion
use criterion::{black_box, criterion_group, criterion_main, Criterion};
fn fibonacci(n: u64) -> u64 {
match n {
0 => 0,
1 => 1,
_ => fibonacci(n - 1) + fibonacci(n - 2),
}
}
fn fibonacci_iterative(n: u64) -> u64 {
if n <= 1 {
return n;
}
let mut a = 0;
let mut b = 1;
for _ in 2..=n {
let temp = a + b;
a = b;
b = temp;
}
b
}
fn criterion_benchmark(c: &mut Criterion) {
let mut group = c.benchmark_group("fibonacci");
group.bench_function("recursive_10", |b| {
b.iter(|| fibonacci(black_box(10)))
});
group.bench_function("iterative_10", |b| {
b.iter(|| fibonacci_iterative(black_box(10)))
});
group.finish();
}
criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);
Profiling with perf
# Compile with debug info
cargo build --release
# Run perf record
perf record -g ./target/release/my_program
# Analyze the results
perf report
Flamegraphs
# Install flamegraph
cargo install flamegraph
# Generate a flamegraph
cargo flamegraph
# Open the generated SVG file