

Ceph Perf Histograms ↦ Prometheus

Resources

Prometheus Histogram Format

One-dimensional. Consists of three parts:

  • Buckets. Each is defined by an upper bound and includes all previous buckets. Each line represents −∞ < x ≤ boundary.
  • Sum of all sample values.
  • Sample count, which by definition equals the value of the +Inf bucket.
# A histogram, which has a pretty complex representation in the text format:
# HELP http_request_duration_seconds A histogram of the request duration.
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{le="0.05"} 24054
http_request_duration_seconds_bucket{le="0.1"} 33444
http_request_duration_seconds_bucket{le="0.2"} 100392
http_request_duration_seconds_bucket{le="0.5"} 129389
http_request_duration_seconds_bucket{le="1"} 133988
http_request_duration_seconds_bucket{le="+Inf"} 144320
http_request_duration_seconds_sum 53423
http_request_duration_seconds_count 144320

(Example from github/prometheus)

Defines the buckets:

  1. b1 ≤ 0.05
  2. b2 ≤ 0.1
  3. b3 ≤ 0.2
  4. b4 ≤ 0.5
  5. b5 ≤ 1
  6. b6 ≤ ∞ or 1 < b6
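
To make the cumulative bucket semantics concrete, here is a minimal sketch of recording samples into such a histogram. The PromHistogram type and its members are hypothetical names chosen for illustration; they are not part of any Prometheus client library.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical, minimal model of a Prometheus-style histogram.
struct PromHistogram {
  std::vector<double> upper_bounds;  // finite `le` boundaries, ascending
  std::vector<uint64_t> buckets;     // cumulative counts; last entry is le="+Inf"
  double sum = 0.0;                  // sum of all observed values
  uint64_t count = 0;                // number of observations

  explicit PromHistogram(std::vector<double> bounds)
      : upper_bounds(std::move(bounds)), buckets(upper_bounds.size() + 1, 0) {}

  void observe(double x) {
    // A sample is counted in every bucket whose upper bound is >= x.
    for (std::size_t i = 0; i < upper_bounds.size(); ++i) {
      if (x <= upper_bounds[i]) {
        ++buckets[i];
      }
    }
    ++buckets.back();  // the +Inf bucket counts every sample
    sum += x;
    ++count;
  }
};
```

With the boundaries from the example, observing 0.03, 0.15 and 2.0 would leave the buckets at 1, 1, 2, 2, 2, 3, and count equal to the +Inf bucket.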

Ceph Perf Histograms

Two-dimensional, with latency and size being the common axes. Bucket counters are exclusive: each sample is counted in exactly one bucket. There is no sum of all values.

The docs mention that histograms are typed (float, uint64, time, etc.), but PerfCountersBuilder only supports a uint64 counter histogram.
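
As a rough mental model (not Ceph's actual implementation), a two-dimensional histogram with exclusive uint64 counters can be pictured as a flat grid of cells. The Hist2D class below is a hypothetical illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical 2D histogram with exclusive bucket counters.
class Hist2D {
 public:
  Hist2D(std::size_t x_buckets, std::size_t y_buckets)
      : y_(y_buckets), cells_(x_buckets * y_buckets, 0) {}

  // Each sample increments exactly one cell (exclusive counting).
  void inc(std::size_t x, std::size_t y) { ++cells_.at(x * y_ + y); }

  uint64_t read(std::size_t x, std::size_t y) const {
    return cells_.at(x * y_ + y);
  }

  // Number of samples overall. Note that this is a count, not a sum of the
  // sampled values: the values themselves are not retained.
  uint64_t total() const {
    return std::accumulate(cells_.begin(), cells_.end(), uint64_t{0});
  }

 private:
  std::size_t y_;                // number of buckets on the second axis
  std::vector<uint64_t> cells_;  // row-major grid of counters
};
```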

Axis

PerfHistogramCommon::axis_config_d perfcounter_op_hist_x_axis_config{
    "Latency (µs)",
    PerfHistogramCommon::SCALE_LOG2, // Latency in logarithmic scale
    100,                             // Start
    900,                             // Quantization unit
    18,                              // buckets
};

Buckets can't be defined explicitly. They are derived from the start value, the quantization unit and the scale. Each boundary is start + scale factor * quantization unit, where the scale factor for step n is 2^n (SCALE_LOG2) or n (SCALE_LINEAR). I use a spreadsheet to find good values.
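
Instead of a spreadsheet, the formula can be evaluated with a few lines of code. This is a standalone sketch of the formula as stated here, not Ceph's code: the Scale enum, scale_factor and boundaries are hypothetical names, and Ceph's own bucket indexing may be offset (for example, by reserving the first bucket for values below start).

```cpp
#include <cstdint>
#include <vector>

enum class Scale { Linear, Log2 };  // stand-in for SCALE_LINEAR / SCALE_LOG2

// Scale factor for step n: n for linear, 2^n for log2.
inline int64_t scale_factor(int64_t n, Scale scale) {
  return scale == Scale::Linear ? n : (int64_t{1} << n);
}

// Boundaries start + scale_factor(n) * quant for n = 0 .. steps - 1.
// Assumes steps is small enough that 2^n does not overflow int64_t.
inline std::vector<int64_t> boundaries(int64_t start, int64_t quant,
                                       int64_t steps, Scale scale) {
  std::vector<int64_t> out;
  for (int64_t n = 0; n < steps; ++n) {
    out.push_back(start + scale_factor(n, scale) * quant);
  }
  return out;
}
```

For the latency axis above (start 100, unit 900, log2), boundaries(100, 900, 4, Scale::Log2) yields 1000, 1900, 3700 and 7300 µs.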

An axis can also have just one bucket:

PerfHistogramCommon::axis_config_d perfcounter_op_hist_y_axis_config{
    "Count", PerfHistogramCommon::SCALE_LINEAR, 0, 1, 1,
};

In this case the histogram becomes effectively one-dimensional: every bucket of the first axis has a single counter.

The common pattern is latency × size, though:

  PerfHistogramCommon::axis_config_d op_hist_y_axis_config{
    "Request size (bytes)",
    PerfHistogramCommon::SCALE_LOG2, ///< Request size in logarithmic scale
    0,                               ///< Start at 0
    512,                             ///< Quantization unit is 512 bytes
    32,                              ///< Enough to cover requests larger than GB
  };

(From osd_perf_counters.cc)

Conversion

Option 1: 1D + separate time counter

Use a perf histogram with a pseudo second dimension (see the one-bucket axis above) and a separate counter to track the sum. The definition could look like this:

PerfHistogramCommon::axis_config_d perfcounter_op_hist_x_axis_config{
    "Latency (µs)",
    PerfHistogramCommon::SCALE_LOG2, // Latency in logarithmic scale
    100,                             // Start
    900,                             // Quantization unit
    18,                              // buckets
};

PerfHistogramCommon::axis_config_d perfcounter_op_hist_y_axis_config{
    "Count", PerfHistogramCommon::SCALE_LINEAR, 0, 1, 1,
};

.add_u64_counter_histogram(i, rgw_op_type_str(static_cast<RGWOpType>(i)),
                           perfcounter_op_hist_x_axis_config,
                           perfcounter_op_hist_y_axis_config,
                           "Histogram of operation service time in µs");

.add_time(i, rgw_op_type_str(static_cast<RGWOpType>(i)));

The conversion to Prometheus could look like this:

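// Running total: Ceph buckets are exclusive, Prometheus buckets are cumulative.
uint64_t count = 0;
for (int64_t bucket_no = 0; bucket_no < ac.m_buckets; bucket_no++) {
  std::vector<std::string> bucket_labels(labels);
  // The last bucket is open-ended; every other bucket is labeled with its
  // inclusive upper bound (exclusive bound - 1), clamped at 0.
  bucket_labels.emplace_back(
      (bucket_no == ac.m_buckets - 1)
          ? "le=\"+Inf\""
          : fmt::format("le=\"{}\"",
                        std::max<int64_t>(
                            0, (ac.m_min +
                                get_quants(bucket_no, ac.m_scale_type) *
                                    ac.m_quant_size) -
                                   1)));
  // Index 0 on the second axis is its only bucket (the pseudo dimension).
  count += data.histogram->read_bucket(bucket_no, 0);
  fmt::print(os, "{name}_bucket{labels} {value}\n", "name"_a = name,
             "labels"_a = format_labels(bucket_labels),
             "value"_a = count);
}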

The loop iterates over the buckets defined by the axis config and keeps a running total, so that each emitted bucket includes all lower buckets, as Prometheus expects.

Printing the _sum line is a bit tricky. We need to find the matching counter and convert it to the same unit as the buckets. In the example, add_time defines a time counter that uses utime_t, while the histogram counters are uint64 with buckets in µs.
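
The unit conversion itself is simple once the time value is available as seconds plus nanoseconds. The sketch below uses a hypothetical TimeValue struct as a stand-in for the time counter's value; how the value is actually read from the perf counters is not shown and would depend on the Ceph API.

```cpp
#include <cstdint>

// Hypothetical stand-in for a time counter value (seconds + nanoseconds).
struct TimeValue {
  uint64_t sec;
  uint32_t nsec;
};

// Convert to whole microseconds so that _sum uses the same unit as the
// bucket boundaries. Sub-microsecond precision is truncated.
inline uint64_t to_usec(const TimeValue& t) {
  return t.sec * 1000000ULL + t.nsec / 1000;
}
```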

Option 2: 2D: Turn dimension into labeled histogram

For every bucket B of dimension a, print the histogram of dimension b with an additional label B. For a (duration × size) histogram this yields size-labeled histograms:

# HELP http_request_duration_seconds A histogram of the request duration.
# TYPE http_request_duration_seconds histogram

http_request_duration_seconds_bucket{size="1024", le="0.05"} 24054
http_request_duration_seconds_bucket{size="1024", le="0.1"} 33444
http_request_duration_seconds_bucket{size="1024", le="0.2"} 100392
http_request_duration_seconds_bucket{size="1024", le="0.5"} 129389
http_request_duration_seconds_bucket{size="1024", le="1"} 133988
http_request_duration_seconds_bucket{size="1024", le="+Inf"} 144320
http_request_duration_seconds_count{size="1024"} 144320

...

http_request_duration_seconds_bucket{size="+Inf", le="0.05"} 24054
http_request_duration_seconds_bucket{size="+Inf", le="0.1"} 33444
http_request_duration_seconds_bucket{size="+Inf", le="0.2"} 100392
http_request_duration_seconds_bucket{size="+Inf", le="0.5"} 129389
http_request_duration_seconds_bucket{size="+Inf", le="1"} 133988
http_request_duration_seconds_bucket{size="+Inf", le="+Inf"} 144320
http_request_duration_seconds_count{size="+Inf"} 144320

Option 3: 2D into 1D

When printing a dimension, sum up all counts of the other dimension and print that as the bucket value.

This is basically printing the "+Inf" labeled histograms as values:

http_request_duration_seconds_bucket{le="0.05"} $http_request_duration_seconds_bucket{size="+Inf", le="0.05"}
http_request_duration_seconds_bucket{le="0.1"} $http_request_duration_seconds_bucket{size="+Inf", le="0.1"}
http_request_duration_seconds_bucket{le="0.2"} $http_request_duration_seconds_bucket{size="+Inf", le="0.2"}
http_request_duration_seconds_bucket{le="0.5"} $http_request_duration_seconds_bucket{size="+Inf", le="0.5"}
http_request_duration_seconds_bucket{le="1"} $http_request_duration_seconds_bucket{size="+Inf", le="1"}
http_request_duration_seconds_bucket{le="+Inf"} $http_request_duration_seconds_bucket{size="+Inf", le="+Inf"}
http_request_duration_seconds_count $http_request_duration_seconds_count{size="+Inf"}
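
Computed directly from the exclusive counters, the collapse could look like the following sketch. The function name collapse_to_1d is hypothetical, and counts[l][s] is assumed to hold the exclusive counter for latency bucket l and size bucket s.

```cpp
#include <cstdint>
#include <vector>

// Sums away the size dimension and returns cumulative latency buckets,
// one value per latency bucket (the last one is the le="+Inf" value).
inline std::vector<uint64_t> collapse_to_1d(
    const std::vector<std::vector<uint64_t>>& counts) {
  std::vector<uint64_t> out;
  out.reserve(counts.size());
  uint64_t running = 0;  // running total along the latency axis
  for (const auto& sizes : counts) {
    for (uint64_t c : sizes) {
      running += c;  // add every size bucket of this latency bucket
    }
    out.push_back(running);
  }
  return out;
}
```

The last element of the result is also the value for the _count line.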