Commit 1e4e058

Histogram spec: Rework and amend operators section
- Divide operators section into sub-sections and reformat accordingly.
- Add that negative results of multiplication count as gauge histogram.
- Correct and complete the sub-section about trim operators.

Signed-off-by: beorn7 <beorn@grafana.com>
1 parent 9837e9c commit 1e4e058

1 file changed: +75 -18 lines changed
docs/specs/native_histograms.md

Lines changed: 75 additions & 18 deletions
Original file line number | Diff line number | Diff line change
@@ -1833,17 +1833,21 @@ the reason why the level of the annotation is only info.)
18331833

18341834
The following describes all the operations that actually _do_ work.
18351835

1836+
#### Addition and subtraction
1837+
18361838
Addition (`+`) and subtraction (`-`) work between two compatible histograms.
18371839
These operators add or subtract all matching bucket populations and the count
18381840
and the sum of observations. Missing buckets are assumed to be empty and
1839-
treated accordingly. Generally, both operands should be gauges. Adding and
1840-
subtracting counter histograms requires caution, but PromQL allows it. Adding a
1841-
gauge histogram and a counter histogram results in a gauge histogram. Adding
1842-
two counter histograms results in a counter histogram. If the two operands
1843-
share the same counter reset hint, the resulting counter histogram retains that
1844-
counter reset hint. Otherwise, the resulting counter reset hint is set to
1845-
`UnknownCounterReset`. The result of a subtraction is always marked as a gauge
1846-
histogram because it might result in negative histograms, see [notes
1841+
treated accordingly.
1842+
1843+
Generally, both operands should be gauges. Adding and subtracting counter
1844+
histograms requires caution, but PromQL allows it. Adding a gauge histogram and
1845+
a counter histogram results in a gauge histogram. Adding two counter histograms
1846+
results in a counter histogram. If the two operands share the same counter
1847+
reset hint, the resulting counter histogram retains that counter reset hint.
1848+
Otherwise, the resulting counter reset hint is set to `UnknownCounterReset`.
1849+
The result of a subtraction is always marked as a gauge histogram because it
1850+
might result in negative histograms, see [notes
18471851
above](#unary-minus-and-negative-histograms). Adding or subtracting two counter
18481852
histograms with directly contradicting counter reset hints (i.e. `CounterReset`
18491853
and `NotCounterReset`) triggers a warn-level annotation. (TODO: As described
@@ -1853,14 +1857,20 @@ circumstances involving the `HistogramStatsIterator`, which includes additional
18531857
counter reset tracking. See [tracking
18541858
issue](https://github.com/prometheus/prometheus/issues/15346).)
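The flavor and counter reset hint rules for `+` and `-` can be sketched as a small decision function. This is an illustrative sketch only, not the PromQL engine code: the `Hint` enum and the `combine_hints` function are hypothetical names, a gauge histogram is modeled as a fourth hint value, and the special handling mentioned in the TODO is not covered.

```python
from enum import Enum


class Hint(Enum):
    """Hypothetical model of the counter reset hint; GAUGE marks a gauge histogram."""
    UNKNOWN = "UnknownCounterReset"
    RESET = "CounterReset"
    NOT_RESET = "NotCounterReset"
    GAUGE = "gauge"


def combine_hints(left, right, op):
    """Return (resulting hint, warn) for `left op right`, with op being "+" or "-"."""
    # Directly contradicting hints on two counter histograms trigger a warning.
    contradicting = {left, right} == {Hint.RESET, Hint.NOT_RESET}
    if op == "-":
        # A subtraction is always marked as a gauge histogram.
        return Hint.GAUGE, contradicting
    if Hint.GAUGE in (left, right):
        # Gauge + counter (or gauge + gauge) results in a gauge histogram.
        return Hint.GAUGE, False
    if left == right:
        # Two counters sharing the same hint retain it.
        return left, False
    return Hint.UNKNOWN, contradicting
```

For example, `combine_hints(Hint.RESET, Hint.NOT_RESET, "+")` yields `UnknownCounterReset` together with the warning flag, while the same operands with `"-"` yield a gauge histogram and the warning flag.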
18551859

1860+
#### Multiplication
1861+
18561862
Multiplication (`*`) works between a float sample or a scalar on the one side
18571863
and a histogram on the other side, in any order. It multiplies all bucket
18581864
populations and the count and the sum of observations by the float (sample or
18591865
scalar). This will lead to “scaled” and sometimes even negative histograms,
18601866
which is usually only useful as intermediate results inside other expressions
1861-
(see also [notes above](#unary-minus-and-negative-histograms)). Multiplication
1862-
works for both counter histograms and gauge histograms, and their flavor is left
1863-
unchanged by the operation.
1867+
(see also [notes above](#unary-minus-and-negative-histograms)).
1868+
1869+
Multiplication works for both counter histograms and gauge histograms, and
1870+
their flavor is left unchanged by the operation, with the exception that a
1871+
negative histogram is always considered to be a gauge histogram.
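A minimal sketch of the multiplication rule, assuming a histogram is modeled as a plain dictionary (the keys `buckets`, `count`, `sum`, and `flavor` are hypothetical, not the Prometheus data model). It also assumes that "negative histogram" means a histogram with a negative count or at least one negative bucket population.

```python
def scale(hist, factor):
    """Multiply all bucket populations, the count, and the sum by `factor`."""
    buckets = {bound: n * factor for bound, n in hist["buckets"].items()}
    count = hist["count"] * factor
    # Assumption: a histogram counts as negative if the count or any
    # bucket population is below zero after scaling.
    negative = count < 0 or any(n < 0 for n in buckets.values())
    return {
        "buckets": buckets,
        "count": count,
        "sum": hist["sum"] * factor,
        # The flavor is left unchanged unless the result is negative.
        "flavor": "gauge" if negative else hist["flavor"],
    }
```

Scaling a counter histogram by `2` keeps it a counter histogram in this model, while scaling it by `-1` turns it into a gauge histogram.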
1872+
1873+
#### Division
18641874

18651875
Division (`/`) works between a histogram on the left hand side and a float
18661876
sample or a scalar on the right hand side. It is equivalent to multiplication
@@ -1870,30 +1880,77 @@ and sum of observations all set to `+Inf`, `-Inf`, or `NaN`, depending on their
18701880
values in the input histogram (positive, negative, or zero/`NaN`,
18711881
respectively).
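The mapping of values to `+Inf`, `-Inf`, or `NaN` on division by zero follows the usual floating-point convention. Python raises an exception for float division by zero, so the following sketch spells the convention out explicitly. The helper name `divide_value` is hypothetical, the sketch is based only on the part of the paragraph visible in this diff, and it ignores the sign of a negative zero divisor.

```python
import math


def divide_value(x, divisor):
    """Divide one bucket population, count, or sum by a float."""
    if divisor != 0:
        return x / divisor
    # Division by zero: the result depends on the value in the input histogram.
    if math.isnan(x) or x == 0:
        return math.nan
    return math.inf if x > 0 else -math.inf
```

Applying `divide_value` to every bucket population, the count, and the sum of a histogram would give the element-wise behavior described in the text.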
18721882

1883+
#### Equality and inequality
1884+
18731885
Equality (`==`) and inequality (`!=`) work between two histograms, both in
18741886
their filtering version as well as with the `bool` modifier. They compare the
18751887
schema, the custom values, the zero threshold, all bucket populations, and the
18761888
sum and count of observations. Whether the histograms have counter or gauge
18771889
flavor is irrelevant for the comparison. (A counter histogram could be equal to
18781890
a gauge histogram.)
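One way to picture this comparison is a record type whose flavor field is excluded from equality. The `Histogram` class below is a hypothetical model for illustration, not the Prometheus type. It uses the standard library `dataclasses.field(compare=False)` to leave the flavor out of `==` and `!=`.

```python
from dataclasses import dataclass, field


@dataclass
class Histogram:
    """Hypothetical histogram model; every field except `flavor` takes part in ==."""
    schema: int
    custom_values: tuple
    zero_threshold: float
    buckets: dict
    sum: float
    count: float
    # Counter vs. gauge flavor is irrelevant for the comparison.
    flavor: str = field(default="gauge", compare=False)
```

With this model, a counter histogram and a gauge histogram that agree in all other fields compare as equal.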
18791891

1892+
#### Logical and set operators
1893+
18801894
The logical/set binary operators (`and`, `or`, `unless`) work as expected even
18811895
if histogram samples are involved. They only check for the existence of a
18821896
vector element and don't change their behavior depending on the sample type or
18831897
flavor of an element (float or histogram, counter or gauge).
18841898

1885-
The “trim” operators `>/` and `</` were introduced specifically for native
1899+
#### Trim operators
1900+
1901+
The “trim” operators `</` and `>/` were introduced specifically for native
18861902
histograms. They only work for a histogram on the left hand side and a float
18871903
sample or a scalar on the right hand side. (They do not work for float samples
18881904
or scalars on _both_ sides. An info-level annotation is returned in this case.)
1905+
18891906
These operators remove observations from the histogram that are greater or
18901907
smaller than the float value on the right side, respectively, and return the
1891-
resulting histogram. The removal is only precise if the threshold coincides
1892-
with a bucket boundary. Otherwise, interpolation within the affected buckets
1893-
has to be used, as described [above](#interpolation-within-a-bucket). The
1894-
counter vs. gauge flavor of the histogram is preserved. (TODO: These operators
1895-
are not yet implemented and might also change in detail, see [tracking
1896-
issue](https://github.com/prometheus/prometheus/issues/14651).)
1908+
resulting histogram.
1909+
1910+
The removal is only precise if the threshold coincides with a bucket boundary.
1911+
Otherwise, interpolation within the affected buckets has to be used, as
1912+
described [above](#interpolation-within-a-bucket). All observations in buckets
1913+
with any limit of positive or negative infinity are considered to be of
1914+
positive or negative infinity, respectively, for the purpose of this
1915+
interpolation. (In the pathological edge case where a histogram has only a single
1916+
bucket with a lower limit of -Inf and an upper limit of +Inf, all observations
1917+
are considered to be of value zero.)
1918+
1919+
If any observations have been removed, the sum of all observations in the
1920+
resulting histogram is estimated from the remaining buckets. The value of each
1921+
observation in a given bucket is estimated to be the following:
1922+
1923+
- The upper bound of the lowest bucket of an NHCB if this upper bound is
1924+
negative or zero.
1925+
- The lower bound for the overflow bucket of an NHCB.
1926+
- -Inf for the negative overflow bucket of a standard exponential histogram.
1927+
- +Inf for the positive overflow bucket of a standard exponential histogram.
1928+
- The arithmetic mean for all other buckets of an NHCB and for the zero bucket
1929+
of a standard exponential histogram, taking into account the known heuristics
1930+
for the following special cases:
1931+
- The lowest bucket of an NHCB is considered to have a lower bound of zero if
1932+
its upper bound is positive.
1933+
- The lower bound of the zero bucket of a standard exponential histogram is
1934+
considered to be zero if the histogram has no populated negative buckets.
1935+
- The upper bound of the zero bucket of a standard exponential histogram is
1936+
considered to be zero if the histogram has no populated positive buckets.
1937+
- The geometric mean for all other buckets of a standard exponential histogram.
1938+
1939+
The (arithmetic or geometric) mean calculation for the bucket that was only
1940+
partially removed (using the interpolation described above) is modified in the
1941+
following way: The relevant upper bound (for `</`) or lower bound (for `>/`) is
1942+
considered to be equal to the cutoff float value provided as the 2nd operand.
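The core of the sum estimation for regular buckets can be sketched as follows. This is a simplified illustration under stated assumptions: it only covers buckets with finite bounds that do not span zero, it leaves out the overflow, zero-bucket, and lowest-NHCB-bucket special cases listed above, and the function names and the `(lower, upper, count)` tuple layout are hypothetical. The populations are assumed to be the ones remaining after trimming and interpolation.

```python
import math


def per_observation_estimate(lower, upper, exponential):
    """Estimated value of each observation in a regular bucket."""
    if exponential:
        # Geometric mean; both bounds of a regular exponential bucket share a sign.
        magnitude = math.sqrt(lower * upper)
        return -magnitude if upper <= 0 else magnitude
    # Arithmetic mean for regular NHCB buckets.
    return (lower + upper) / 2.0


def estimate_sum(buckets, exponential):
    """Estimate the sum of observations from (lower, upper, count) tuples."""
    return sum(
        count * per_observation_estimate(lower, upper, exponential)
        for lower, upper, count in buckets
    )


# Hypothetical example: exponential buckets (1, 2] and (2, 4] trimmed with `</ 3`.
# The partially removed bucket uses the cutoff 3 as its upper bound, and we
# assume 5 observations remain in it after interpolation.
trimmed = [(1.0, 2.0, 4.0), (2.0, 3.0, 5.0)]
estimated = estimate_sum(trimmed, exponential=True)
```

Replacing the upper bound of the last bucket by the cutoff is what the modification for the partially removed bucket amounts to in this model. For `>/`, the lower bound would be replaced instead.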
1943+
1944+
Note that this estimation of the sum of observations is inaccurate, up to a
1945+
point where it could yield results that are obviously wrong. For example, after
1946+
removing some positive observations, the estimated sum of observations could be
1947+
larger than the sum of observations in the original histogram. The estimation
1948+
algorithm could be refined, but it is kept deliberately simple to make it
1949+
easier to reason about. In general, the histograms resulting from trim
1950+
operations are meant to be used for quantile estimation. Their sum of
1951+
observations is considered to be of limited use.
1952+
1953+
The trim operators preserve the counter vs. gauge flavor of the histogram.
18971954

18981955
### Aggregation operators
18991956
