The mature optimization handbook

Published 2021-02-16

http://carlos.bueno.org/optimization/

'The Mature Optimization Handbook' is about monitoring and profiling continuous systems. Reasonable ideas but not particularly dense.

The performance problem definition must be falsifiable

Use performance measurements to try to falsify the theory, not just confirm it

A measurement is a number obtained during some profiling event

Metadata are attributes of the system or profiling event

A sample is a collection of measurements and metadata relating to a single event

A metric is a statement about a set of samples, typically an aggregation
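
A minimal sketch of how these four terms could fit together, assuming samples are kept as simple records. The `Sample` class, the `wall_us` measurement name and the `endpoint` metadata key are hypothetical names chosen for illustration, not taken from the book.

```python
from dataclasses import dataclass, field
from statistics import median


@dataclass
class Sample:
    """One profiling event: measurements (numbers) plus metadata (attributes)."""
    measurements: dict[str, int]                             # e.g. {"wall_us": 1200}
    metadata: dict[str, str] = field(default_factory=dict)   # e.g. {"endpoint": "/home"}


def median_metric(samples: list[Sample], name: str) -> float:
    """A metric: an aggregate statement about a set of samples."""
    return median(s.measurements[name] for s in samples)


# Hypothetical samples, one per profiled request
samples = [
    Sample({"wall_us": 1200}, {"endpoint": "/home"}),
    Sample({"wall_us": 900}, {"endpoint": "/home"}),
    Sample({"wall_us": 4100}, {"endpoint": "/search"}),
]
print(median_metric(samples, "wall_us"))
```

Keeping metadata next to the measurements is what later allows metrics to be filtered or grouped by any attribute.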

Continuous systems need to be measured in production

Store recent samples in a flat table in RAM, so we can ask unexpected questions

Store old samples as aggregated metrics, to save space
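
One possible way to combine the two storage rules above, as a rough sketch: keep a bounded flat table of raw samples in memory and fold each evicted sample into an hourly aggregate. `RECENT_LIMIT`, the field names and the hourly bucket size are all assumptions made for this example.

```python
from collections import defaultdict, deque

RECENT_LIMIT = 4  # hypothetical cap; tiny so the example is easy to follow

recent = deque()  # flat table: one dict per raw sample
hourly = defaultdict(lambda: {"count": 0, "total_us": 0})  # aggregated history


def record(sample: dict) -> None:
    """Append a sample; fold the oldest ones into hourly aggregates when full."""
    recent.append(sample)
    while len(recent) > RECENT_LIMIT:
        old = recent.popleft()
        bucket = hourly[old["ts"] // 3600]  # ts is in seconds
        bucket["count"] += 1
        bucket["total_us"] += old["wall_us"]


for i in range(6):
    record({"ts": i * 1000, "endpoint": "/home", "wall_us": 100 + i})

# Unexpected questions can still be asked of the raw recent rows
slow = [s for s in recent if s["wall_us"] > 103]
```

The aggregate only answers the questions it was designed for (here, count and mean per hour), which is the trade made to save space.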

Measurement systems need to be tested by sanity checking and independent confirmation

The dimensions of performance measurements are usually time, space and instructions

Record time measurements in u64 microseconds

Record space measurements in u64 bytes

Record instruction measurements in u64 kilo-instructions (a machine can execute more than 1000 instructions per microsecond, so raw instruction counts might overflow)
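
A small sketch of recording in these units, assuming Python's standard `time.perf_counter_ns` as the clock. The helper names `to_u64`, `timed_us` and `instructions_to_kilo` are hypothetical, and the raw instruction count is assumed to come from some external source such as a hardware counter.

```python
import time

U64_MAX = 2**64 - 1


def to_u64(value: int) -> int:
    """Reject values that would not fit in an unsigned 64-bit field."""
    if not 0 <= value <= U64_MAX:
        raise OverflowError(f"{value} does not fit in u64")
    return value


def timed_us(fn):
    """Run fn() and return (result, elapsed wall time in whole microseconds)."""
    start = time.perf_counter_ns()
    result = fn()
    elapsed_us = (time.perf_counter_ns() - start) // 1000
    return result, to_u64(elapsed_us)


def instructions_to_kilo(count: int) -> int:
    """Convert a raw instruction count to whole kilo-instructions."""
    return to_u64(count // 1000)


result, wall_us = timed_us(lambda: sum(range(1000)))
```

Storing everything as integers in one fixed unit per dimension avoids unit mix-ups when samples from different sources are compared.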

The main visualization needs for monitoring are raw data (in a table), time series, histograms and scatter plots

Any visualization a human has ever looked at is probably important enough for a permalink

Design monitoring dashboards by asking: "While the system is operating normally, the ___ graph should never ___"

Design diagnosis tools by asking: "When the ___ is operating abnormally, the ___ graph can eliminate ___ as a possible cause"

Localize performance anomalies by recursively subdividing on different dimensions
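
The subdivision idea can be sketched as follows, assuming samples are flat dicts and that "worst" means the highest mean of one measurement. The `localize` function and the `datacenter`, `endpoint` and `wall_us` fields are hypothetical; a real tool would likely compare against a baseline instead of just taking the maximum.

```python
from collections import defaultdict


def localize(samples, dimensions, measure="wall_us"):
    """Split on each dimension in turn, recursing into the slowest group."""
    if not dimensions or not samples:
        return []
    dim, rest = dimensions[0], dimensions[1:]

    # Subdivide the current set of samples by this dimension
    groups = defaultdict(list)
    for s in samples:
        groups[s[dim]].append(s)

    def mean(rows):
        return sum(r[measure] for r in rows) / len(rows)

    # Follow the group with the highest mean measurement
    worst = max(groups, key=lambda value: mean(groups[value]))
    return [(dim, worst)] + localize(groups[worst], rest, measure)


samples = [
    {"datacenter": "east", "endpoint": "/home", "wall_us": 100},
    {"datacenter": "east", "endpoint": "/search", "wall_us": 120},
    {"datacenter": "west", "endpoint": "/home", "wall_us": 110},
    {"datacenter": "west", "endpoint": "/search", "wall_us": 900},
]
path = localize(samples, ["datacenter", "endpoint"])
```

Here `path` narrows the anomaly first to a datacenter and then to an endpoint within it. Trying the dimensions in a different order is a cheap way to check that the result is not an artifact of the split order.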

Choose alarm thresholds by comparing against historical data
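
One simple way to derive a threshold from history, offered as a sketch: take a high percentile of past values and add a safety margin. The 99th percentile and the 1.2 margin are arbitrary assumptions for illustration, and `alarm_threshold` and `should_alarm` are hypothetical names.

```python
from statistics import quantiles


def alarm_threshold(history, percentile=99, margin=1.2):
    """Threshold = chosen percentile of historical values, scaled by a margin."""
    cuts = quantiles(history, n=100, method="inclusive")  # 99 cut points
    return cuts[percentile - 1] * margin


def should_alarm(value, history):
    """True when a new value exceeds the threshold derived from history."""
    return value > alarm_threshold(history)


# Hypothetical history: past latency readings in microseconds
history = list(range(1, 101))
```

With this history, a new reading of 50 would not alarm while a reading of 500 would. Recomputing the threshold as the history grows keeps it in step with gradual, expected drift.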