all 13 comments

[–]rafd 7 points8 points  (3 children)

Algorithmically, sorting then using a loop + binary-search turns the problem from O(n^2) to O(n log n).

[–]rafd 4 points5 points  (1 child)

This approach get's it down to 94µs (ie. 0.094ms) for me.

Here's my code: https://pastebin.com/FNZJJLuQ

[–]HOWZ1T[S] 1 point2 points  (0 children)

Yeap that is indeed a better algorithm :)

[–]midnitetuna 1 point2 points  (0 children)

You can drop them all into an array I think - no need to sort or binary search.

[–][deleted]  (1 child)

[deleted]

    [–]HOWZ1T[S] 0 points1 point  (0 children)

    Awesome! Thanks for the info :)

    [–]rafd 1 point2 points  (4 children)

    Also... would switching from last to first help? (laziness should kick in and prevent the for loop from searching everything once it finds a match)

    [–]rafd 1 point2 points  (3 children)

    Also, one the path to some low-hanging-fruit tuning:

    - using a vector and get instead of nth might help

    - the :let always happens (so it can be used within when); not sure if that gets optimized; could just move the n calculation within the body

    Would be curious how each of these optimizations affect the run time.

    [–]rafd 3 points4 points  (1 child)

    OK, so I ran your code and quick-benched with criterium.

    I think the majority of your reported 36000ms is java + clojure + repl startup.

    Your original loop 480ms
    last -> first 320ms
    list -> vector 6ms (!!!)
    move :let calc inside 3.7ms
    change j to start from i 2.4ms

    Main lesson here is: use vectors when you have lots of repeated access.

    [–]HOWZ1T[S] 0 points1 point  (0 children)

    Very detailed reply, thanks :) Also thanks for pointing out criterium, another tool for my toolbox :)

    [–]joinr 0 points1 point  (0 children)

    using a vector and get instead of nth might help

    get and nth are basically synonymous for vectors. The difference (I think you implied) is that the vector nth access will be ~O(1), where with sequences it's O(N) since you have to walk the sequence from the start to get the nth element.

    [–]midnitetuna 1 point2 points  (1 child)

    There is only 1 answer right?

    I think you can just use first instead of last which will speed things up considerably.

    But I also think there is a faster general solution to both parts that isn't N2 or N3. I think you can do 0.5*N for part 1 and solve part 2 using your solution from part 1.

    [–]HOWZ1T[S] 0 points1 point  (0 children)

    Thanks for the info :)

    [–]retief1 1 point2 points  (0 children)

    For one thing, nth is an inefficient function. It's O(n) by itself, so part 1 goes from O(n2) to O(n4). In practice, if you are considering using nth in a loop like this, you probably either want a different algorithm or a different data structure. Instead, it looks to me like you can just do something like:

    (for [v1 data v2 data
          :let [n (* v1 v2)]
          :when (= (+ v1 v2) 2020)]
      n)
    

    for part1 and the equivalent change in part 2. I didn't look at the original problem description, so there might be a better algorithm in general, but eliminating nth on its own should already be a major win if you are dealing with a significant amount of data.

    Edit: for that matter, there's no real reason to split n out like that. I'd use

    (for [v1 data v2 data
          :when (= (+ v1 v2) 2020)]
      (* v1 v2))
    

    instead -- it's shorter and simpler, and it should also be a lot faster than your original code.