Introduction | Andrew Odendaal

Performance Optimization Fundamentals

Before diving into specific techniques, let’s establish some fundamental principles:

The Optimization Process

1. Measure - Establish a baseline and identify bottlenecks
2. Analyze - Understand why the bottlenecks exist
3. Improve - Make targeted changes to address the bottlenecks
4. Verify - Measure again to confirm improvements
5. Repeat - Continue until performance goals are met
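
The measure and verify steps can be sketched with nothing more than the standard library. This is a rough illustration, not a replacement for a real benchmarking harness: the helper `time_per_call` and the two `sum_of_squares_*` functions are hypothetical names invented for this example, and a simple wall-clock loop like this is noisy compared with a statistical tool.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical helper: runs `f` `iterations` times and returns the mean
/// wall-clock time per call. `iterations` must be greater than zero.
fn time_per_call<F: FnMut()>(iterations: u32, mut f: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iterations {
        f();
    }
    start.elapsed() / iterations
}

/// Baseline implementation (the code being measured in step 1).
fn sum_of_squares_baseline(values: &[u64]) -> u64 {
    let mut total = 0;
    for &v in values {
        total += v * v;
    }
    total
}

/// Candidate change (step 3). Whether it is faster is exactly what
/// the second measurement is meant to find out.
fn sum_of_squares_candidate(values: &[u64]) -> u64 {
    values.iter().map(|v| v * v).sum()
}

fn main() {
    let values: Vec<u64> = (1..=1_000).collect();

    // A change that alters the result is not an optimization, so check that first.
    assert_eq!(
        sum_of_squares_baseline(&values),
        sum_of_squares_candidate(&values)
    );

    // Step 1: measure the baseline. `black_box` keeps the compiler from
    // optimizing the call away because its result is unused.
    let baseline = time_per_call(10_000, || {
        black_box(sum_of_squares_baseline(black_box(&values)));
    });

    // Step 4: measure again after the change.
    let candidate = time_per_call(10_000, || {
        black_box(sum_of_squares_candidate(black_box(&values)));
    });

    println!("baseline:  {:?} per call", baseline);
    println!("candidate: {:?} per call", candidate);
}
```

Run it with `cargo run --release`; timings from a debug build say little about optimized code.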

Premature Optimization

As Donald Knuth famously wrote, “Premature optimization is the root of all evil.” Write clear, correct code first, then optimize only where measurements show it matters:

// First version: Clear and correct
fn calculate_average(numbers: &[f64]) -> f64 {
    let sum: f64 = numbers.iter().sum();
    sum / numbers.len() as f64
}

// Optimized version (only after profiling shows it's needed)
fn calculate_average_optimized(numbers: &[f64]) -> f64 {
    let mut sum = 0.0;
    let len = numbers.len();
    
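    // Manual loop unrolling: process four elements per iteration.
    // Note: this changes the order of the floating-point additions, so the
    // result can differ from the simple version in the last few bits.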
    let mut i = 0;
    while i + 4 <= len {
        sum += numbers[i] + numbers[i + 1] + numbers[i + 2] + numbers[i + 3];
        i += 4;
    }
    
    // Handle remaining elements
    while i < len {
        sum += numbers[i];
        i += 1;
    }
    
    sum / len as f64
}
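
One possible variation on the unrolled loop, shown here only as a sketch, is to let `chunks_exact` split the slice into groups of four instead of indexing by hand. The function name `calculate_average_chunked` is made up for this example, and returning `None` for an empty slice is a choice of this sketch: both versions above return NaN in that case, because they divide 0.0 by 0.0. Whether this form is any faster is something only a benchmark can tell.

```rust
/// Sketch of an unrolled average built on `chunks_exact` rather than manual
/// index arithmetic. Returns `None` for an empty slice instead of NaN.
fn calculate_average_chunked(numbers: &[f64]) -> Option<f64> {
    if numbers.is_empty() {
        return None;
    }

    // Full groups of four elements, plus whatever is left over at the end.
    let chunks = numbers.chunks_exact(4);
    let remainder = chunks.remainder();

    let mut sum = 0.0;
    for chunk in chunks {
        // Each chunk is guaranteed to have exactly four elements.
        sum += chunk[0] + chunk[1] + chunk[2] + chunk[3];
    }

    // Handle the remaining zero to three elements.
    for &x in remainder {
        sum += x;
    }

    Some(sum / numbers.len() as f64)
}

fn main() {
    let data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0];
    println!("{:?}", calculate_average_chunked(&data));
    println!("{:?}", calculate_average_chunked(&[]));
}
```

Making the empty case explicit in the return type pushes the decision to the caller, which is often easier to reason about than a NaN that propagates silently.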

Profiling and Benchmarking

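Before optimizing, identify where your code actually spends its time. Benchmarks measure how fast a specific piece of code runs; profilers show where a whole program spends its time.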

Benchmarking with Criterion

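// Assumes `criterion` is listed under [dev-dependencies] and that the
// bench target sets `harness = false` in Cargo.toml.
use criterion::{black_box, criterion_group, criterion_main, Criterion};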

fn fibonacci(n: u64) -> u64 {
    match n {
        0 => 0,
        1 => 1,
        _ => fibonacci(n - 1) + fibonacci(n - 2),
    }
}

fn fibonacci_iterative(n: u64) -> u64 {
    if n <= 1 {
        return n;
    }
    
    let mut a = 0;
    let mut b = 1;
    
    for _ in 2..=n {
        let temp = a + b;
        a = b;
        b = temp;
    }
    
    b
}

fn criterion_benchmark(c: &mut Criterion) {
    let mut group = c.benchmark_group("fibonacci");
    
    group.bench_function("recursive_10", |b| {
        b.iter(|| fibonacci(black_box(10)))
    });
    
    group.bench_function("iterative_10", |b| {
        b.iter(|| fibonacci_iterative(black_box(10)))
    });
    
    group.finish();
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);

Profiling with perf

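# Build an optimized binary with debug info so perf can map samples to source lines
# (release builds omit debug info by default; alternatively set `debug = true`
# under [profile.release] in Cargo.toml)
CARGO_PROFILE_RELEASE_DEBUG=true cargo build --release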
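# Run perf record with call-graph collection
# (if call stacks look truncated, `--call-graph dwarf` may give better results than `-g`)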
perf record -g ./target/release/my_program
# Analyze the results
perf report

Flamegraphs

# Install flamegraph
cargo install flamegraph
# Generate a flamegraph
cargo flamegraph
# Open the generated SVG file (flamegraph.svg by default) in a browser

Continue Your Learning

This is part 1 of 5 in the comprehensive guide.
