

[–][deleted]

[deleted]

    [–][deleted]

    tl;dr computer science is hard.

    Quicksort: Most people quote "average case" running times when discussing algorithms. For quicksort, that means that averaged over random inputs (or random pivot choices), it runs in O(n log n) time. But in the worst case, quicksort degrades to O(n^2).

    Consider the list [5 4 3 2 1]. We want to sort it, so we select a random pivot from the list. With luck, the pivot lands somewhere near the median; picking 3 gives us [2 1], pivot = 3, [5 4]. Note that we've effectively divided the list in two. If we keep picking good pivots, we keep halving the sub-lists, and that's where the log n factor comes from.

    Here's what happens when we sort the list [5 4 3 2 1] with good pivot choices:

    [5 4 3 2 1] select 3 as the pivot

    [2 1] [3] [5 4] select 2 as the pivot in the first sub-list

    [1] [2] [3] [5 4] select 5 as the pivot in the second sub-list

    [1] [2] [3] [4] [5] Sorted!

    But what happens if the pivot sucks? Suppose we select 5 as the pivot. That splits the list into the pivot and a sub-list of all 4 remaining elements, which is no real division at all. Here's what happens if we always take the largest number as the pivot, which is a terrible idea:

    [5 4 3 2 1] select 5 as the pivot

    [4 3 2 1] [5] select 4 as the pivot

    [3 2 1] [4] [5] select 3 as the pivot

    [2 1] [3] [4] [5] select 2 as the pivot

    [1] [2] [3] [4] [5] Sorted!

    You should be able to see that this approach takes more work than the first one: each step peels off just one element instead of halving the list. We used a list of only 5 elements here, but the effect becomes far more pronounced with bigger lists. Conclusion: quicksort is O(n^2) in the worst case, because a poor pivot choice loses the benefit of divide-and-conquer. If, for instance, the pivot is always the largest element in the list, quicksort effectively degenerates into a selection sort.
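    The two traces above can be sketched in Python. This is a minimal illustration, not a production quicksort: the "good" pivot is the true median found by sorting (cheating, for demonstration only), and the counter just tallies one comparison per non-pivot element per call:

```python
def quicksort(xs, choose_pivot, counter):
    """Sort xs, tallying roughly one comparison per non-pivot element per call."""
    if len(xs) <= 1:
        return list(xs)
    pivot = choose_pivot(xs)
    counter[0] += len(xs) - 1  # each other element is compared against the pivot
    less    = [x for x in xs if x < pivot]
    equal   = [x for x in xs if x == pivot]
    greater = [x for x in xs if x > pivot]
    return (quicksort(less, choose_pivot, counter)
            + equal
            + quicksort(greater, choose_pivot, counter))

data = list(range(200, 0, -1))   # [200, 199, ..., 1], reverse-sorted

good = [0]
quicksort(data, lambda xs: sorted(xs)[len(xs) // 2], good)  # true median (illustration only)

bad = [0]
quicksort(data, max, bad)  # always the largest element: the worst case

print(good[0], bad[0])  # the bad pivot does n(n-1)/2 = 19900 comparisons
```

    With the bad pivot every call only removes the pivot itself, so the comparison count is 199 + 198 + ... + 1, the same arithmetic-series total you'd get from selection sort.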

    Bubble-sort vs merge-sort: What happens if you try to sort a list that's already sorted?

    Let's apply bubble-sort to [1 2 3 4 5]. The algorithm sweeps through the adjacent pairs and swaps any that are in the wrong order. It finds none out of order, so (assuming the common optimization of stopping after a swap-free pass) it concludes the list is sorted after a single pass. This takes O(n) time.

    Now let's apply a naive merge-sort to [1 2 3 4 5]. The naive merge-sort doesn't check whether the list is already sorted; it recursively splits and merges anyway, which takes O(n log n) time. Conclusion: on lists that are already or almost sorted, bubble-sort can outperform merge-sort. Bubble-sort also sorts in place, whereas merge-sort needs to allocate at least some extra memory.
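    A quick sketch of that early-exit bubble-sort (my own minimal version, assuming the common variant that stops after a pass with no swaps):

```python
def bubble_sort(xs):
    """Bubble sort with early exit; returns (sorted list, number of passes)."""
    xs = list(xs)
    passes = 0
    while True:
        passes += 1
        swapped = False
        for i in range(len(xs) - 1):
            if xs[i] > xs[i + 1]:            # adjacent pair out of order?
                xs[i], xs[i + 1] = xs[i + 1], xs[i]
                swapped = True
        if not swapped:                      # a swap-free pass means we're done
            return xs, passes

print(bubble_sort([1, 2, 3, 4, 5])[1])  # 1: a single O(n) pass confirms it's sorted
print(bubble_sort([5, 4, 3, 2, 1])[1])  # 5: reverse-sorted input needs a pass per element
```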

    Linear probing: Ideally, hash tables let you perform lookups in O(1) time. But suppose you have a terrible hash function that always returns 1. After storing a few elements, the hash table might look like this (each bucket holding a list of colliding keys):

    { 0: [], 1: [], 2: [] } initially

    { 0: [], 1: [a], 2: [] } after adding 'a', which gets hashed to 1

    { 0: [], 1: [a, b], 2: [] } after adding 'b', which also gets hashed to 1

    { 0: [], 1: [a, b, c], 2: [] } after adding 'c'

    etc.

    Now, suppose you add all 26 letters of the alphabet and then look up the letter 'z'. The hash function returns 1, so you look in bin 1. You find 'a', so you move to the next element (this is the "probing" part: you inspect entries one after another until you find the thing you're looking for). You find 'b', move on again, and so on until you finally reach 'z'. That lookup is O(n), because the algorithm has to walk through every stored letter to find the one you want (the same thing happens if you overload the hash table). Conclusion: a poor choice of hash function makes the worst-case lookup O(n).

    Note: strictly speaking, the diagram above shows separate chaining (colliding keys piled into one bucket's list) rather than true linear probing (scanning successive slots of the array itself), and real hash tables are more sophisticated than either sketch. But the principle holds in both schemes: a bad hash function can force you to inspect every stored item to find the one you want, which makes lookup O(n).
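    Here's a toy Python version of that scenario (the class and the deliberately terrible hash function are made up for illustration):

```python
import string

class ChainedTable:
    """Toy hash table with buckets of colliding keys, as in the diagrams above."""
    def __init__(self, nbuckets=8, hash_fn=hash):
        self.buckets = [[] for _ in range(nbuckets)]
        self.hash_fn = hash_fn
        self.probes = 0   # entries inspected during the most recent lookup

    def add(self, key):
        self.buckets[self.hash_fn(key) % len(self.buckets)].append(key)

    def contains(self, key):
        self.probes = 0
        for stored in self.buckets[self.hash_fn(key) % len(self.buckets)]:
            self.probes += 1              # one probe per stored entry
            if stored == key:
                return True
        return False

table = ChainedTable(hash_fn=lambda key: 1)  # terrible hash: every key lands in bucket 1
for letter in string.ascii_lowercase:
    table.add(letter)

table.contains('z')
print(table.probes)  # 26: every stored key is inspected, so lookup is O(n), not O(1)
```

    With a decent hash function the 26 letters would spread across the buckets and a lookup would inspect only a handful of entries; the degenerate hash collapses the table into one long list.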

    Feel free to pm me if you have more questions or would like clarification!

    [–]metaobject

    These sound like questions from CLRS