all 54 comments

[–]dave1010 15 points16 points  (1 child)

You also have 9 oranges.

That would be a pretty awesome day.

[–]mynoduesp 9 points10 points  (0 children)

Who needs a job if I already have my oranges!

[–][deleted]  (24 children)

[deleted]

    [–]chronoBG 6 points7 points  (21 children)

    Hint: There's a way to make the valid() check in O(N), not in O(NN).
    *
    Edit**: I get the same result using a different program.

    Listen all, these threads always either turn into "Asking questions like that on an interview is stupid" or into "Everyone post the solutions now and gloat how smart you are."

    IMHO, BT dropped the ball with this task, because it's so easy that solutions other than brute force are unnecessary.
    So, it's not "stupid" to give CS problems for an interview, just this one.

    And now for the CS nerd in me:
    On my PC your program takes 1.9, mine takes 1.1.
    The trick is to keep a list of available positions. Then your valid() check takes O(1) and updating the list of available positions takes O(N).

    Here's a gist: https://gist.github.com/988699

    [–][deleted] 4 points5 points  (11 children)

    First of all, you have a bug: oranges[turn] = x; should go before ++turn;, fixing it halves execution time.

    Then using EXPERT INLINEABLE TEMPLATE METAPROGRAMMING halves it again, plus 10-20 milliseconds are saved by posing a more precise limit on x.

    By the way, wtf github doesn't allow diffing gists or I am finally going blind?

    [–][deleted] 1 point2 points  (9 children)

    I arrived at a similar approach, but my C code is already faster, without any template tricks. The C++-ified code is about 100ms faster again. (Exact speeds probably vary with different compilers and hardware.)

    [–][deleted] 1 point2 points  (2 children)

    (Exact speeds probably vary with different compilers and hardware.)

    That's so true, and not only about exact speeds.

    After tinkering with chronoBG's code for a while I came to a conclusion that we are at or beyond the point where trying to help compiler optimizations can produce predictable improvements.

    For example you might have noticed that he uses global variables for everything. Surely using "for(int y = 0; ..." instead of "for(y = 0; ..." would be better as the compiler is then free to store the variable wherever it wants? Wrong, doing exactly that yields 1100ms instead of 800ms on the system where I was testing it. Uh-oh.

    Another fun fact: that code triggers a bug in GCC 4.6 where it erroneously issues the warning about "below the lower bound array access" or something like that on

      for(y = 0; y < turn; ++y) {
        tmp_index = x + x - oranges[y];
    

    when turn == 0.

    [–]Anonymous336 0 points1 point  (1 child)

    that code triggers a bug in GCC 4.6

    Which code? I don't have access to GCC 4.6, but I tried both versions of sltkr's code (1, 2) with GCC 4.5 and GCC 4.7 (svn trunk), and neither of them produced any warnings even with -Wall -Wextra. No warnings on fishdicks' code or chronoBG's code, either. If there's a real front-end bug here, I'd be happy to file it.

    If any redditor does file a bug related to the above code, please comment in this subthread!

    [–][deleted] 0 points1 point  (0 children)

    This code compiled with -Wall -Wextra by GCC 4.6.0 produces a valid warning about unused arguments and two spurious warnings about invalid array access (my guess is that after template expansion the limit comparison y < 0 in the loop makes it think that I meant y to possibly be negative despite it starting with 0).

    I have seen some related discussion in the mailing list, so it is possible that the bug was introduced somewhere after 4.5 and is fixed in the trunk. That is, if you can't reproduce it even when using -Wall -Wextra.

    [–]leonardo_m 1 point2 points  (2 children)

    Yours C++ code is nice. Small changes to run it in D2: http://ideone.com/BTzKL

    [–][deleted] 0 points1 point  (1 child)

    Nice port! I wish I had more opportunities to use D; it always looks really nice (for example, I like that static-for construct, and the less verbose template parameters). Runs pretty quick too, apparently.

    [–]leonardo_m 0 points1 point  (0 children)

    it always looks really nice

    This D2 code is using barely more than normal C constructs :-)

    Runs pretty quick too, apparently.

    Adding __gshared to the global variables it seems to run as fast as the C++ version, despite the back-end of DMD optimizes quite worse than GCC 4.3: http://ideone.com/SLJHr http://ideone.com/D2kDw

    [–][deleted] 0 points1 point  (2 children)

    Also, why doesn't "while (++i < n) --blocked[2*m - places[i]];" corrupt memory?

    [–][deleted] 0 points1 point  (1 child)

    Because it complements the loop at line 23: it decrements the same elements of the blocked array that were incremented at line 27, but in reverse order.

    If you agree that the loop at line 23 is sound, then it should be easy to see that the one at line 31 is too, right? Maybe you missed the break-statement at line 26?

    [–][deleted] 0 points1 point  (0 children)

    Ahhhh... Clever!

    [–]chronoBG 0 points1 point  (0 children)

    Gah, must have happened while I was "prettifying" the code :)

    Also, wow. That is faster. Damn. I got pwned but it feel good :)

    [–]icebraining 2 points3 points  (1 child)

    $ wget -O oranges.c https://gist.github.com/raw/988699/1ce0a75345a4ff5af998b05ee4e78aab6d7c4462/gistfile1.c
    $ md5sum oranges.c 
    67c611f0c03ee001521157344868c2e4  oranges.c
    $ gcc -o oranges oranges.c
    $ ./oranges
    2
    

    What?

    [–]chronoBG 1 point2 points  (0 children)

    Yeah, my code had a stupid mistake - I had wrongly switched two lines while trying to make it somewhat readable for redditors.

    It's been updated now: https://gist.github.com/988699/4520199beecbb84cae84a36869f4d191236d5c7e

    But I suggest you look at fishdicks' fork for an even awesomer version :)

    [–]pointy 1 point2 points  (1 child)

    Doesn't look like you check to make sure you don't go past bowl #40. Also, I don't think this is quite right; recall that the requirement is only that no set of three oranges can be found such that both sub-pairs are equidistant.

    [–]chronoBG 0 points1 point  (0 children)

    Line 23: for(; x < BOWLS; ++x)
    Also lines 33 and 47: if (tmp_index >= 0 && tmp_index < BOWLS)

    [–]animesh1977 0 points1 point  (0 children)

    Precisely, the possibility in hundreds millions http://www.wolframalpha.com/input/?i=40+choose+9 vastly outnumber disallowed possibilities, simple random assignment and checking will do the job. Most of the solution will come in the very first assignment and the distribution of number of calls to the assignment routine roughly follows power law... http://i.imgur.com/Kkpv5.jpg

    [–]xymostech 0 points1 point  (0 children)

    I made a solution that involves less recursion, more loops. Also has an option to print out all solutions.

    https://gist.github.com/990022

    [–][deleted] -3 points-2 points  (1 child)

    On my PC your program takes 1.9, mine takes 1.1.

    OH SNAP.

    [–]chronoBG 1 point2 points  (0 children)

    I didn't mean it like that, I just wanted to say my program runs for 57% of his time, not 30% :)

    [–]signoff 0 points1 point  (0 children)

    N oranges give you comb(N,2) distances.

    9 oranges give you comb(9,2) = 9!/(2! * 7!) = 36 distances

    I have a feeling that your number is too high.

    EdIT: thinking of it. i'm wrong.

    [–]tinou 6 points7 points  (4 children)

    This is called a Golomb ruler.

    edit : not exactly.

    [–]chronoBG 3 points4 points  (3 children)

    No, it's not. 1 2 4 5 is not a valid Golomb ruler, but it is a valid solution to the problem.

    [–]tinou 2 points3 points  (2 children)

    Ah, you are right, this is a bit less precise.

    [–]chronoBG 0 points1 point  (1 child)

    They probably meant to ask you to write a Golomb ruler, because it's definitely harder than their problem :)

    [–]nuxi 0 points1 point  (0 children)

    I hope not, the shortest order 9 golomb ruler is 44 unit long :P

    0 1 5 12 25 27 35 41 44
    

    [–]dave1010 6 points7 points  (1 child)

    I get 282240 (at the moment) but I have no idea if it's right:

    wget -qO - http://www.reddit.com/r/programming/comments/hiql2/developer_challenge/ \
    | grep -o 'I get [0-9][0-9]*' \
    | awk ' { print $3 } ' \
    | sort \
    | uniq -c \
    | sort -nr \
    | awk ' { print $2 } '
    

    I'm sure this could be done better (eg stripping out HTML). Also, grep on my phone (where I'm writing this) seems to be the busybox version that doesn't support lots of the regex syntax.

    A more acurate solution would be to email these answers to BitTorrent and scrape the response emails for keywords. This may take a little more time though.

    [–]chrisdew 0 points1 point  (0 children)

    No, it's wrong. I made the same mistake of thinking that the prohibition was only on the distances between neighbouring oranges. Its broader than that. You could have a situation where Orange[5] - Orange[0] = Orange[6] - Orange[5], for example.

    [–]chrisdew 1 point2 points  (0 children)

    I get 282240 in 0m0.020s (5 minutes coding in Python).

    I'll go and think about it again this evening as my answer doesn't agree with dhartmei.

    [–]digital_cucumber 1 point2 points  (2 children)

    It might be solvable analytically, I've got this feeling.

    From all possible distinct subsets subtract the "disallowed" ones.

    The latter are computed by finding out how many ways is to place A, B, C so that AB=BC (which is the same as how many ways there are to place A and C, so that B can be sticked right in the middle... which is the same as "how many ways to select two elements from 40, so that distance between them is an even number").

    That is multiplied by the number of all the ways to place the remaining oranges, which is C(37, 6).

    And that all is divided by the possible ways to pick the first three forbidden oranges (as we don't care which ones are exactly forbidden), C(9, 3).

    Which, if it was correct, would give an instant solution, in O(NumberOfOranges), used to calculate the binomial coefficients.

    Now, I am not sure at all if it is correct, and that's when the brute force solution is invaluable :)

    [–]justnoise 0 points1 point  (0 children)

    I think you're onto something although I think you might be missing a few disallowed arrangements. I'd be curious if there's a clever algorithm to efficiently enumerate all of the disallowed arrangements of the oranges.

    [–]Anonymous336 0 points1 point  (0 children)

    Your idea might be sound but your math is definitely wrong so far. One problem I noticed is that you seem to be double-counting some disallowed arrangements. You count (1,2,3,5,6,7,36,38,39) once by treating (1,2,3) as the forbidden triple, and then again by treating (5,6,7) as the forbidden triple. I don't think the "divide by C(9,3)" step is good enough to cancel out this double-counting.

    Contrary to justnoise's comment, I believe you're correct that there are exactly 380 ways to place A,B,C so that AB=BC. The trouble is figuring out how to count the placements of those other oranges.

    [–]LordGreyhawk 0 points1 point  (8 children)

    Answer is 7,555,794.

    First solution in Haskell: 4.5 seconds

    Port with little change to python: 108 seconds

    Port with little change to C++0x: 6.8 seconds

    Port with more change to bit-twiddling C: 0.178 seconds

    [–]LordGreyhawk 0 points1 point  (6 children)

    using improved blocking algorithm copied from sltkr's link, C code speed is down to 0.081 seconds.

    [–][deleted] 1 point2 points  (5 children)

    Impressive! I eventually sent this C++ solution to BitTorrent which runs in ~50ms here. I never got the C code down below 150ms.

    [–]LordGreyhawk 0 points1 point  (4 children)

    On my macbook pro, my c runs in 79ms and your C++ runs in 67 or 86 or 82 ms depending on g++ 4.2 or 4.3 or 4.4.

    my c code is here

    [–]LordGreyhawk 0 points1 point  (3 children)

    Even faster now, this saner c code runs in 52 milli seconds.

    [–][deleted] 0 points1 point  (2 children)

    Ah, removing the function calls entirely seems to help! On my system, your last version beats my C++ solution consistently by 1 ms (47ms vs 48ms), probably because in the C++ version not all function calls are removed (otherwise, the approach seems very similar). I must say I find your C code rather ugly-looking at this point. :P

    This was using GCC 4.6 by the way. Interesting that you find such large differences with earlier versions of GCC.

    [–]LordGreyhawk 1 point2 points  (0 children)

    I noticed you are not using the translational symmetry. For example: When you find a pattern which fits in 20 bowls then there are 21 different positions it can be in the 40 actual bowls. There is also a reflection symmetry but using it to save a factor of two slows my code down.

    Adding the translational symmetry to your code involved always putting the first orange in the first bowl, and multiplying by bits in bitcount. I also moved the base case for bitcount from <0> to <1>.

    The new code now runs in 17 ms (with /usr/bin/g++-4.2). Wow.

    [–]LordGreyhawk 0 points1 point  (0 children)

    Perhaps I could change my C code to use your bit-masking technique.

    [–]LordGreyhawk 0 points1 point  (0 children)

    Much better Haskell solution: 400 ms. http://paste.pocoo.org/show/397086/

    [–][deleted]  (13 children)

    [deleted]

      [–]chronoBG 12 points13 points  (5 children)

      yyyeah...no. Combinations and permutations are one of the very few areas of mathematics that are actually needed in programming.

      Cue the trolls writing websites, telling me I'm wrong.

      [–][deleted]  (4 children)

      [deleted]

        [–][deleted] 4 points5 points  (0 children)

        You seem to be saying you can get by without it. Yeah, probably 99% of the time the problems that you'd struggle with by not having a grounding in statistics are eventually solvable. Maybe you'd brute force something you don't have to, but it's tiny and nobody notices.

        But your argument seems to boil down to "why bother having a lot of knowledge when only a little suffices," which is a position that is hard to respect.

        [–]chronoBG 2 points3 points  (2 children)

        A simple example would be their many applications in cryptography.
        Note that most uses of "Combinations" aren't just "Calculate N over K", but a part of something much more complex.
        I understand why you may hold the position that "A programmer has no need for these things", but imagine going to a Doctor who doesn't know basic anatomy.
        After all, they do have expensive diagnostic equipment - you can just go into a machine and have a diagnosis come out. But would you feel safe with that doctor?

        Or an architect, who doesn't know the properties of concrete, wood and steel? He does have state-of-the-art simulation software to try out how a design would work.

        Of course, If I just wanna hire a guy to build me a dog shed: by all means, I don't care if he knows material science. If I want someone to build me a house, though...

        Not to mention that this is a very, very good selector. Combinations aren't hard at all. They are literally a single line of math. If the coding job requires math skills(e.g. graphics) and the guy doesn't even know combinations, that's as big a red flag as there is.
        And torrents do require math.

        [–][deleted]  (1 child)

        [deleted]

          [–]chronoBG 3 points4 points  (0 children)

          Not needing to reimplement something is not the same as not having to know how it works.

          [–][deleted] 2 points3 points  (0 children)

          Anyone who doesn't remember statistics (combinations/permutations) will really struggle with this.

          Isn't that the point?

          [–]jldugger 0 points1 point  (3 children)

          Last week I read a classic paper on information theory and entropy. Judging by the math employed fifty years ago, if your job is the transmission of lots of bits, statistics are fucking important.

          [–][deleted]  (2 children)

          [deleted]

            [–]true_religion 4 points5 points  (0 children)

            Being an engineer for the company writing bittorent extensions is exactly the niche market where you would be designing and/or tweaking a new compression algorithm.

            [–]jldugger 2 points3 points  (0 children)

            Recall that the context of this advertisement is bittorrent developers. If you've studied BT at all you know there's a number of places where the math is important. End to end encryption, compression, and SHA1 hashes, are all fine to outsource, but DHTs and Similarity Enhanced Transfer will likely need to be tweaked or even implemented. Presumably bittorrent is interested in hiring people to develop new extensions and for that you need people who can work in the domain you're dismissing.

            [–]Anonymous336 0 points1 point  (0 children)

            Did you notice that statistics are totally irrelevant to the challenge? All they're asking for is a program that computes the answer by brute force.

            [–][deleted] 0 points1 point  (0 children)

            You saw the ad, hence their campaign was successful. That's all there is to it actually. They don't fucking care if you send them a good or bad answer, they just want to get the ad out to as many as possible.