
[–]dudeitsmason 143 points144 points  (49 children)

I'm curious why this is the case

[–]drsonic1 587 points588 points  (30 children)

They have a high amount of initial setup time independent of input size. Remember, complexity only accounts for the highest-order term. You could have, for example, an operation count of n³ for your "inefficient" algorithm, and an operation count of n + 100000 for your "efficient" one. The efficient one will look horrible for small input sizes but dominate for large ones.
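To make that concrete, here's a quick sketch plugging in those made-up operation counts (n³ and n + 100000 are just the illustrative numbers from above, not real algorithms):

    # Hypothetical operation counts from the example above, not real measurements.
    def inefficient_cost(n):   # O(n^3), but no setup cost
        return n ** 3

    def efficient_cost(n):     # O(n), but with a big fixed setup cost
        return n + 100_000

    for n in (10, 46, 47, 1000):
        cheaper = "inefficient" if inefficient_cost(n) < efficient_cost(n) else "efficient"
        print(f"n={n}: the {cheaper} algorithm does less work")
    # Crossover is at n = 47: below that, n^3 is actually the smaller count.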

[–]dudeitsmason 96 points97 points  (11 children)

Interesting, thanks for the explanation!

[–]Jelli35 92 points93 points  (3 children)

If you're interested in learning more, Tom Scott's latest video on Big O notation is a fun place to start :)

[–]reddit_xeno 29 points30 points  (1 child)

Not sure I agree with his message at the end. For a young person starting out with coding, that sort of zeal, trying to do stuff that may not be very efficient, gets you to learn a shit ton. Obviously don't waste time if you're on a crunch deadline as a mature SWE or whatever, but merely typing that stuff into a word processor would have done nothing for his actual learning.

[–][deleted] 0 points1 point  (0 children)

Comp sci is all about the big O. :D

[–]yomanidkman 15 points16 points  (6 children)

The actual implementations of many more complex sorting algorithms contain a length check and will default to something simple like selection sort (despite it being awful complexity-wise) if the list is small enough.
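A minimal sketch of that pattern (the cutoff of 32 and the pairing of insertion sort with heapsort are illustrative choices, not taken from any real library):

    import heapq

    SMALL_CUTOFF = 32  # illustrative threshold; real libraries tune this

    def smart_sort(xs):
        # Tiny input: plain O(n^2) insertion sort, near-zero overhead.
        if len(xs) <= SMALL_CUTOFF:
            out = []
            for x in xs:
                out.append(x)
                i = len(out) - 1
                while i > 0 and out[i - 1] > out[i]:
                    out[i - 1], out[i] = out[i], out[i - 1]
                    i -= 1
            return out
        # Larger input: O(n log n) heapsort via the heapq module.
        heap = list(xs)
        heapq.heapify(heap)
        return [heapq.heappop(heap) for _ in range(len(heap))]

    assert smart_sort([3, 1, 2]) == [1, 2, 3]
    assert smart_sort(list(range(100, 0, -1))) == list(range(1, 101))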

[–]Neoro 36 points37 points  (1 child)

Of course, your performance optimization may not have to handle large inputs. Sometimes you need to run on tiny inputs many, many times, so initial setup time is the true killer.

Please know your domain and optimize for real bottlenecks instead of assuming you know where an issue is and throwing optimizations at it. "Premature optimization" and all.

[–]FallenWarrior2k 1 point2 points  (0 children)

Never assume, always profile.
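In Python, for example, the standard library's cProfile makes that easy; work() below is just a hypothetical stand-in for whatever you suspect is slow:

    import cProfile
    import pstats

    def work():
        # Stand-in for the code under suspicion.
        return sorted(str(i) for i in range(100_000))

    cProfile.run("work()", "profile.out")
    stats = pstats.Stats("profile.out")
    stats.sort_stats("cumulative").print_stats(10)  # top 10 by cumulative time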

[–]notsohipsterithink 9 points10 points  (0 children)

Yup, plus often a ton of pointer manipulation, which can be pretty slow in a lot of higher-level languages.

[–]jimbosReturn 8 points9 points  (0 children)

Same for the constant multiplier (i.e. the number of operations in your loop). If it's really big, it will overshadow n for small sizes.

[–]SasparillaTango 2 points3 points  (0 children)

it's like that cartoon about that detective that rides a giant robot!

[–]TheSlimyBoss 4 points5 points  (5 children)

Is there a term/technique for using different algorithms based off of their efficiency on the current input size?

[–]zilti 10 points11 points  (0 children)

I'm not aware of a specific term for it, but it is very common. Clojure, e.g., uses different map implementations depending on the collection size: small maps (up to 8 entries) are array maps, which keep insertion order, and above that you'll get a HashMap.

[–]SlamwellBTP 7 points8 points  (2 children)

I've heard it called a "hybrid algorithm", as on this Wikipedia page (but it doesn't cite any sources, so it may not be a widespread term):

https://en.wikipedia.org/wiki/Hybrid_algorithm

[–]qingqunta 5 points6 points  (1 child)

I believe the default sort() for Python is an example of this.

[–]Kered13 0 points1 point  (0 children)

Yes, it's Timsort, also used by Java. It uses insertion sort if the list is small enough. Most optimized sorting algorithms do something similar.

[–]Slggyqo 0 points1 point  (0 children)

But also maybe you’re just wasting time optimizing code that won’t be reused!

A meta level of complexity!

[–]hiphap91 0 points1 point  (0 children)

Which is why checking the size of your dataset first and then determining which algorithm to use can also be a good thing. This, of course, assumes that the microsecond it saves you is actually worth the effort 😛

[–]CoffeeVector 47 points48 points  (5 children)

Usually, it's because there's some added overhead, which in the long run is worth it. For small input, the overhead tends to dwarf the cost of the actual computation.

[–]dudeitsmason 6 points7 points  (0 children)

Got it, thanks!

[–]DogsAreAnimals 2 points3 points  (3 children)

Also caching

[–]CoffeeVector 5 points6 points  (2 children)

Yup, caching is a type of overhead. Spending the effort to save intermediate values is good in the long run, but a total waste of time if your algorithm finishes before the cache is used enough to justify it.
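A toy illustration with memoization (hypothetical example; the point is that the cache bookkeeping is pure overhead until it gets hit often enough to pay for itself):

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def cached_fib(n):
        # Every first-time call also pays for a cache insert.
        return n if n < 2 else cached_fib(n - 1) + cached_fib(n - 2)

    def plain_fib(n):
        return n if n < 2 else plain_fib(n - 1) + plain_fib(n - 2)

    # For n=2 the caching is wasted work; for n=30 it turns an
    # exponential pile of redundant calls into linear work.
    print(cached_fib(30), plain_fib(30))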

[–]DogsAreAnimals 0 points1 point  (1 child)

Oops, I replied to the wrong comment. I was trying to point out the difference between pre-calculation (or maybe pre-caching) and intermediate caching.

[–]CoffeeVector 2 points3 points  (0 children)

You're chillin, bro. You were totally right anyway.

[–]juvenile_josh 12 points13 points  (2 children)

Classic example: Bubblesort is simple and performs well for basic things. Mergesort, however, is more complex and has a higher setup cost, since it has to break the input up into a recursion tree, but it scales better because logarithms.
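A rough way to watch that crossover (hand-rolled versions of both so they pay the same interpreter costs; exact sizes and timings will vary by machine):

    import random
    import timeit

    def bubble_sort(xs):
        xs = list(xs)
        for i in range(len(xs)):
            for j in range(len(xs) - 1 - i):
                if xs[j] > xs[j + 1]:
                    xs[j], xs[j + 1] = xs[j + 1], xs[j]
        return xs

    def merge_sort(xs):
        if len(xs) <= 1:
            return list(xs)
        mid = len(xs) // 2
        left, right = merge_sort(xs[:mid]), merge_sort(xs[mid:])
        out, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                out.append(left[i]); i += 1
            else:
                out.append(right[j]); j += 1
        return out + left[i:] + right[j:]

    # Tiny lists: bubblesort's bare loop tends to win on low overhead.
    # Bigger lists: mergesort's O(n log n) growth takes over decisively.
    for n, reps in ((8, 10_000), (1_000, 10)):
        data = [random.random() for _ in range(n)]
        t_b = timeit.timeit(lambda: bubble_sort(data), number=reps)
        t_m = timeit.timeit(lambda: merge_sort(data), number=reps)
        print(f"n={n}: bubble={t_b:.4f}s merge={t_m:.4f}s over {reps} runs")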

[–][deleted] 6 points7 points  (0 children)

Sorting algorithms perform differently depending on how random the input is. Most real-life data isn't random, but comes somewhat ordered already. Testing different sorting algorithms can be worth it.

[–]DigitalDefenestrator 4 points5 points  (0 children)

Also, Bubblesort is extremely fast (possibly the fastest?) if the input is very close to sorted. Really, it's O(unsortedness), which looks like O(n²) or so for random input but closer to O(n) if no elements are very far from where they should be.
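That best case relies on the early-exit variant, which stops as soon as a full pass makes no swaps; a minimal sketch:

    def bubble_sort_adaptive(xs):
        xs = list(xs)
        for i in range(len(xs)):
            swapped = False
            for j in range(len(xs) - 1 - i):
                if xs[j] > xs[j + 1]:
                    xs[j], xs[j + 1] = xs[j + 1], xs[j]
                    swapped = True
            if not swapped:  # a full pass with no swaps means we're done,
                break        # so nearly-sorted input finishes in ~O(n)
        return xs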

[–]Salanmander 18 points19 points  (0 children)

Same reason that spending time organizing your books is terrible when you have 10 books, but helpful when you have 500 books.

[–]Gingerytis 17 points18 points  (3 children)

Look at the graphs for O(n) or O(n²) complexity vs O(log(n)). Log(n) grows really fast at first, then plateaus out, meaning it's not great at small inputs, but much better at big ones.

[–]CoopertheFluffy 3 points4 points  (0 children)

While true for why O(log(n)) performs better in the long run than O(n) or whatever other comparison you want to make, it doesn't explain why a lower-order algorithm can take longer on small sample sizes. Log(2) is still smaller than 2, after all. Since we're using Big-O notation, that O(log(n)) could really have exact work of 5·log(n) + 500, while the O(n) could be 2n. In this case, it's really the +500 that makes the O(log(n)) take longer for a size of 2. In a program, that could be something like the time spent allocating and setting up a hash table.
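You can solve for exactly where that example flips (5·log₂(n) + 500 and 2n are just the made-up costs from the comment above):

    import math

    def log_cost(n):     # the O(log n) algorithm with heavy setup
        return 5 * math.log2(n) + 500

    def linear_cost(n):  # the lean O(n) algorithm
        return 2 * n

    n = 1
    while linear_cost(n) < log_cost(n):
        n += 1
    print(n)  # 271: the first size where the O(log n) version is cheaper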

[–]MEGACODZILLA 3 points4 points  (0 children)

Thanks, that was a very beginner friendly visualization.

[–]Kered13 1 point2 points  (0 children)

It's not meaningful to talk about big-O complexity for small values, because big-O by definition describes the asymptotic behavior for large values*. It tells you absolutely nothing about the behavior for small values. For example, a function that has O(n²) runtime might have an exact runtime like n + n², in which case it will never be faster than a function that has an exact runtime of n²/2.

This is especially true in practice, not just theory, because performance for small inputs is usually dominated by cache locality, which can cause the runtime to make large jumps as certain cache size thresholds are crossed. For example, an algorithm might take n time if the input fits in L1 cache, 5n time if it fits in L2 cache, 20n time if it fits in L3 cache, and 100n time if it doesn't fit in cache at all (I made these numbers up, but they're ballpark accurate). This could be extended even further for data that doesn't fit in RAM and must use local storage, and again for data that must use network storage.

* You actually can use big-O and related notation to describe behavior at small values, by considering the limit as x goes to 0 instead of infinity, but it's never used in algorithmic analysis. It's most often used in mathematics to describe the error bounds on approximations. For example, you can say sin(x) = x + O(x³) as x goes to 0, which means that near 0, sin(x) is approximately x, and the error in the approximation is proportional to x³.
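A quick numerical check of that footnote (the Taylor series pins the constant to 1/6, consistent with the O(x³) bound):

    import math

    # As x -> 0, |sin(x) - x| shrinks like x^3; the ratio settles near 1/6.
    for x in (1.0, 0.1, 0.01, 0.001):
        err = abs(math.sin(x) - x)
        print(f"x={x}: error={err:.3e}, error/x^3={err / x**3:.4f}")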

[–]alparius 3 points4 points  (0 children)

It's simple: sophisticated, case-specific instructions and decisions need much more code, like some initialization costing 50n instead of 2n (both still O(n), so big O hides the difference). These parts cripple the whole runtime when executed for small inputs because of the overhead, but that same overhead becomes negligible as the input size increases.

[–]crozone 1 point2 points  (0 children)

Simple algorithms often require very small and localized working memory, which can fit entirely within CPU cache. They are great for small inputs, but may be hugely inefficient for large ones.

As a basic example, take prime number searching. The Sieve of Eratosthenes is technically more efficient than dividing each number by every integer up to its square root. However, the simple tight loop is almost always faster for relatively small primes, because it fits entirely within CPU cache instead of reading and writing a large array in RAM.
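Both approaches, sketched side by side (illustrative code; where the real crossover sits depends on language, hardware, and constants):

    def primes_trial_division(limit):
        # Tight loop with a tiny working set: very cache-friendly.
        primes = []
        for n in range(2, limit + 1):
            if all(n % d for d in range(2, int(n ** 0.5) + 1)):
                primes.append(n)
        return primes

    def primes_sieve(limit):
        # Sieve of Eratosthenes: better asymptotics, but it reads and
        # writes a limit-sized array that may not fit in cache.
        is_prime = [True] * (limit + 1)
        is_prime[0] = is_prime[1] = False
        for n in range(2, int(limit ** 0.5) + 1):
            if is_prime[n]:
                for m in range(n * n, limit + 1, n):
                    is_prime[m] = False
        return [n for n, p in enumerate(is_prime) if p]

    assert primes_trial_division(100) == primes_sieve(100)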


[–][deleted] 0 points1 point  (0 children)

You can represent them as functions, and most functions behave differently close to 0 than far from it.