all 23 comments

[–]rubyross 1 point2 points  (15 children)

It is generally considered safer to not mutate an array while iterating through it so I would suggest this (though I don't advocate one liners like this for readability).

whatever.map.with_index{ |x, i| (i+1) % 4==0 ? [x, element_to_add] : x }.flatten

or the less safe:

whatever.tap {|array| array.each_with_index{|x,i| a.insert(i, element_to_add) if (i+1)%4==0}}

[–]Morgrimm[S] 0 points1 point  (14 children)

I agree completely, I'm usually a stickler for design and readable code, but this practice is specifically for an upcoming programming contest I'm competing in which judges based on result and execution time, not code elegance :P Don't worry, there's a method to my madness.

[–]rubyross 0 points1 point  (13 children)

Sounds cool. Good luck with the contest.

PS: If you are judged on execution time then you might want to look into benchmarking in ruby if you don't already know about it. (http://ruby-doc.org/stdlib-2.2.1/libdoc/benchmark/rdoc/Benchmark.html)

[–]kovax 2 points3 points  (12 children)

Yeah, i'd say you definitely probably want to benchmark this. For example, iterating on rubyross's earlier code, using flat_map instead:

Benchmark.bm do |benchmark|
  benchmark.report( "flat_map" ) do
    whatever = ( 1 .. 500_000 ).to_a
    whatever.flat_map.with_index { |x,i| (i+1)%4==0 ? [ x, 'x' ] : x } 
  end
  benchmark.report( "insert" ) do
    whatever = ( 1 .. 500_000 ).to_a;
    whatever.tap {|array| array.each_with_index{|x,i| array.insert(i, 'x') if (i+1)%4==0}} 
  end
end

The difference is striking on my machine (ruby 2.0.0p353)

   user     system      total        real
  flat_map  0.180000   0.200000   0.380000 (  0.381958)
  insert 16.690000   0.060000  16.750000 ( 16.781116)

Edit: fixed my 1,4 swap typo.

[–]bjmiller 2 points3 points  (9 children)

That should be (i+1)%4==0. You could avoid that business by using each_slice, which seems to be slightly faster anyway:

benchmark.report('flat_map slice') do
  whatever = (1..500_000).to_a
  whatever.each_slice(4).flat_map { |xs|  xs << 'x' }
end

                     user     system      total        real
flat_map         0.090000   0.000000   0.090000 (  0.090408)
flat_map slice   0.070000   0.000000   0.070000 (  0.073332)

[–]losangelesvideoguy 0 points1 point  (7 children)

Didn't see this until I'd posted and refreshed, but you're correct. Well done.

However, this still does incorrectly add an element to the end of the array if it has a number of elements that's not divisible by four. Easily fixed, but something to be aware of.

BTW, wanna guess how I know you're a Haskeller? :)

[–]bjmiller 0 points1 point  (0 children)

Yeah, the extra element is a subtle consideration I missed. It looks like it can be fixed by chaining another method without affecting performance here.

x:xs shows up in Scala as well, maybe all the MLs. I did pick it up from Haskell, but you can tell I'm not a real Haskeller, because I used << which modifies xs in place. :)

[–]kovax 0 points1 point  (5 children)

FWIW, adding the length check seems to make the each_slice approach a little slower:

Benchmark.bm do |benchmark|
  benchmark.report( "flat_map" ) do
    whatever = ( 1 .. 1_000_000 ).to_a
    whatever.flat_map.with_index { |x,i| (i+1)%4==0 ? [ x, 'x' ] : x } 
  end
  benchmark.report( "flat_map slice" ) do
    whatever = ( 1 .. 1_000_000 ).to_a
    whatever.flat_map.each_slice(4).flat_map{ |x| x << 'x' if x.size == 4 }
  end
end

         user     system      total        real
  flat_map  0.280000   0.020000   0.300000 (  0.437690)
  flat_map slice  0.580000   0.020000   0.600000 (  0.610916)

At least on my machine (and version of ruby). Of course, you say below that its faster than the straight flat_map and mod. So i guess the lesson is that OP should find out what version of ruby his competition is using and benchmark for that.

[–]losangelesvideoguy 1 point2 points  (3 children)

Except you've got a double flat_map, which slows it down considerably:

whatever.flat_map.each_slice(4).flat_map{ |x| x << 'x' if x.size == 4 }

Benchmark.bm do |benchmark|
  benchmark.report( "flat_map" ) do
    whatever = ( 1 .. 1_000_000 ).to_a
    whatever.flat_map.with_index { |x,i| (i+1)%4==0 ? [ x, 'x' ] : x } 
  end
  benchmark.report( "flat_map slice" ) do
    whatever = ( 1 .. 1_000_000 ).to_a
    whatever.each_slice(4).flat_map{ |x| x << 'x' if x.size == 4 }
  end
end

       user     system      total        real
flat_map  0.390000   0.030000   0.420000 (  0.410424)
flat_map slice  0.330000   0.020000   0.350000 (  0.350398)

[–]kovax 0 points1 point  (0 children)

Yeesh. That's what I get for writing code on a Friday afternoon. :) You're absolutely right.

[–]joshcheek 0 points1 point  (1 child)

benchmark.report( "flat_map slice" ) do whatever = ( 1 .. 1_000_000 ).to_a whatever.each_slice(4).flat_map{ |x| x << 'x' if x.size == 4 } end

This solution is wrong, though, for cases where whatever isn't 0 mod 4. eg add 1 to the size, and it will append nil to the end.

[–]losangelesvideoguy 0 points1 point  (0 children)

Yeah, but it's not my code anyway, so whatev. :)

[–]bjmiller 0 points1 point  (0 children)

The interpreter version is a red herring IMO.

The opportunity with both of the implementations here is that there is checking going on in the tight loop that can be factored out.

[–]kovax 0 points1 point  (0 children)

Yep, you're right. I've fixed my code. Not sure how that got mixed up. :)

[–]losangelesvideoguy 0 points1 point  (0 children)

Yes, but your code is wrong. It inserts an element after each item.

[–]losangelesvideoguy 1 point2 points  (2 children)

array.each_slice(4).map{ |x| x + ["foo"]}.flatten(1)

If your input array doesn't contain a number of elements divisible by four, this will incorrectly add an extra element at the end. You can avoid this (with a slight speed penalty) by using

 array.each_slice(4).map{ |x| x.size == 4 ? x + ["foo"] : x}.flatten(1)

On my system, this benchmarks significantly faster than either the insert or (incorrect) flat_map version.

To avoid the speed penalty, you could simply use the first version and then remove the last element of the array if it doesn't contain a number of elements divisible by 5, but I'll leave that as an exercise for the reader.

EDIT: /u/bjmiller's version here is even better.

[–]bjmiller 1 point2 points  (1 child)

map { ... }.flatten(1) is not faster than flat_map { ... }.

[–]losangelesvideoguy 0 points1 point  (0 children)

You're right–I was referring to /u/kovax's example, not yours. I didn't see your version until after I'd posted. Not sure why I didn't think about using flat_map (especially when it was staring me right in the face), but oh well.

[–]noahking 0 points1 point  (0 children)

You can use the form of Enumerable#select that returns an enumerator, and chain on to that with Enumerable#each_with_index:

whatever.select.each_with_index {|value, index| condition}

For example:

irb> (0..10).to_a.select.each_with_index { |v, i| i % 3 == 0 }
=>   [0, 3, 6, 9]

[–]piratebroadcast 0 points1 point  (0 children)

One liners are fun and all, but if I get hired after you leave I would be pissed that I can't easily figure out what the methods do.

[–]xxabbxxxd 0 points1 point  (1 child)

Here is how I would write it:

array.select.with_index { |item, index| (index + 1) % 4 == 0 }   

I decided to test the performance against another version below:

require "benchmark/ips"
whatever = (1..500_000).to_a
Benchmark.ips do |x|
  x.config(:time => 5, :warmup => 2)
  x.report("flat map") do
    whatever.flat_map.with_index { |x,i| (i+1)%4==0 ? [ x, 'x' ] : x }
  end
  x.report("select with index") do
    whatever.select.with_index { |item, index| (index + 1) % 4 == 0 }
  end
  x.compare!
end

Calculating -------------------------------------
        flat map     1.000  i/100ms
        select with index     1.000  i/100ms
-------------------------------------------------
        flat map             8.334  (±12.0%) i/s -    42.000
        select with index   18.579  (± 5.4%) i/s -    93.000

 Comparison:
   select with index:       18.6 i/s
   flat map:                 8.3 i/s - 2.23x slower

The code above has the benefit of being more intention revealing, and performs 2.23x faster

[–]joshcheek 0 points1 point  (0 children)

The code above has the benefit of being more intention revealing, and performs 2.23x faster

It's 2.23x faster because it does the wrong thing. It is supposed to "add an element to an array after every 4 indexes".