#!/usr/bin/python3 # -*- coding: utf-8 -*- """Extremely simple RMQ algorithm with expected logarithmic time on random data. In linear time, this algorithm builds an RMQ index structure that is linear in the size as the original data (a parallel array of indices); it can then provide the index of the minimum of any range of any size of the original data in logarithmic average time (though linear time in the worst case). In itself, this is unsurprising, since a balanced tree over the input data indices in which each node stores its minimum can provide the same asymptotic performance, except that its worst case is logarithmic rather than linear; but this algorithm is much, much simpler. Indeed, it’s unbelievably simple. Surely someone has found it before, though I don’t remember having seen it discussed in discussions of RMQ, which is a popular contest problem. This is an adaptation of the ascending minima algorithm for *sliding* RMQ, which I think is due to Richard Harter: This algorithm can be used for efficient computation of greyscale operations of mathematical morphology, such as dilation, erosion, opening, and closing, with arbitrarily shaped structuring elements. While the popular van Herk/Gil–Werman algorithm requires a separate precomputation for each kernel size, this algorithm (like Urbach and Wilkinson’s 2008 chord-table algorithm, doi 10.1.1.442.4549, “Efficient 2-D Grayscale Morphological Transformations With Arbitrary Flat Structuring Elements”) can be used to efficiently compute dilations and erosions along arbitrary subsets of horizontal rasters. I haven’t even put together an efficient implementation of this algorithm yet, much less benchmarked it against the chord-table algorithm, which I still don’t understand. """ import random def rmqi(xs): """Compute the RMQ predecessor pointers for xs in linear time. Given j = list(rmqi(xs))[i] for any i, it should be the case that xs[j] < xs[i] and ∀k ∈ (j, i): xs[k] >= xs[i]; or, iff ∀k ∈ [0, i): xs[k] >= xs[i], then j is None. This is a unique specification of rmqi; exactly one sequence of indices can satisfy it. This is linear time for the same reason as the ascending minima algorithm. """ js = [None] * len(xs) for i in range(1, len(xs)): js[i] = i-1 while js[i] is not None and xs[js[i]] >= xs[i]: js[i] = js[js[i]] return js iterations = 0 def rmq(js, i, j): """RMQ: Return some k ∈ [i, j) such that ∀n ∈ [i, j): xs[n] >= xs[k]. j must be at least i+1, j must be <= len(xs), i must be < len(xs), and js must be rmqi(xs). This is known as the “range minimum query” problem. This function should take logarithmic time on random data; for example, it takes about 4–5 iterations on random ranges of 1000 random items, about 6–8 iterations on random ranges of 10000 random items, and about 9–10 iterations on random ranges of 100'000 random items. The random ranges are taken between two distinct uniformly distributed indices (see the ``test_rmq`` function below), so their average size is ⅓ of the total number of items. That is, I have intuitive and empirical evidence that it’s logarithmic-time, but no proof. Of course, it takes linear time in the worst case — data in ascending order. """ global iterations k = j-1 while js[k] is not None and js[k] >= i: iterations += 1 k = js[k] return k def test_rmqi(xs): js = rmqi(xs) assert len(js) == len(xs) for i, x in enumerate(xs): j = js[i] if j is None: assert all(xk >= x for xk in xs[:i]) else: assert xs[j] <= x assert all(xk >= x for xk in xs[j+1:i]) def test_rmq(n, m, p): xs = [random.randrange(n) for i in range(m)] js = rmqi(xs) for _ in range(p): i = random.randrange(len(xs)) while True: j = random.randrange(len(xs)) if i != j: break if i > j: i, j = j, i k = rmq(js, i, j) assert i <= k < j, (i, j, k) assert all(xs[n] >= xs[k] for n in range(i, j))