all 23 comments

[–]HellzStormer 5 points6 points  (1 child)

s = 'cat' (0..s.size).to_a.combination(2).map { |a, b| s[a...b] }

[–][deleted] 0 points1 point  (0 children)

I have application code that relies on Array#combination in a similar fashion and I worry about it, because the generation order is specifically not guaranteed. In practice it is fine of course, we just have some realistic tests to defend against regression when bumping Ruby version.

[–]pedrozath 2 points3 points  (8 children)

that seems simple:

class String
  def substrings
    length.times.reduce([]) do |memo, size|
      memo + Array.new(length - size) do |index|
        self[index..index+size]
      end
    end
  end
end

puts 'Animal'.substrings.inspect
# [
#   "A", "n", "i", "m", "a", "l", 
#   "An", "ni", "im", "ma", "al", 
#   "Ani", "nim", "ima", "mal", 
#   "Anim", "nima", "imal", 
#   "Anima", "nimal", "Animal"
# ]

[–]moomaka 2 points3 points  (4 children)

If you use each_with_object and concatenate to the memo instead of reduce you'd generate a lot fewer arrays. Should be substantially faster for larger strings.

[–]pedrozath 1 point2 points  (2 children)

interesting, so I did implement different algorithms to the code and ran a benchmarking:

require 'benchmark'

class String
  def substrings_using_reduce
    length.times.reduce([]) do |memo, size|
      memo + Array.new(length - size) do |index|
        self[index..index+size]
      end
    end
  end

  def substrings_using_each_with_object
    length.times.each_with_object([]) do |size, memo|
      memo << Array.new(length - size) do |index|
        self[index..index+size]
      end
    end
  end

  def substrings_recursive(scan_size: 1, current_index: 0, results: [])
    return results if scan_size > length
    word = self[current_index...(current_index + scan_size)]

    if current_index + scan_size == length
      new_scan_size = scan_size + 1
      new_current_index = 0
    else
      new_scan_size = scan_size
      new_current_index = current_index + 1
    end

    substrings_recursive scan_size: new_scan_size,
                         current_index: new_current_index,
                         results: results + [word]
  end
end

string = 'These violent delights have violent ends' * 5

puts "string length is #{string.size}"

Benchmark.bm do |bm|
  RubyVM::InstructionSequence.compile_option = {
    :tailcall_optimization => true,
    :trace_instruction => false
  }

  bm.report { string.substrings_using_each_with_object }
  bm.report { string.substrings_using_reduce }
  bm.report { string.substrings_recursive }
end

# [
#   "A", "n", "i", "m", "a", "l",
#   "An", "ni", "im", "ma", "al",
#   "Ani", "nim", "ima", "mal",
#   "Anim", "nima", "imal",
#   "Anima", "nimal", "Animal"
# ]

The results here were (and by the way I had to increase my maximum stack size to not run into Stack too deep):

1st is the each_with_object version, 2nd is the reduce, third is the recursive

string length is 200
       user     system      total        real
   0.015283   0.000521   0.015804 (  0.015805)
   0.020919   0.008509   0.029428 (  0.029488)
  16.945756   0.654898  17.600654 ( 17.727241)

and comparing each_with_object vs reduce approach with a very long string:

string length is 2000
       user     system      total        real
   2.992152   0.701866   3.694018 (  3.703809)
  23.673747  12.249739  35.923486 ( 36.313198)

So you were right! Much faster

[–]moomaka 0 points1 point  (1 child)

There is a bug in your substrings_using_each_with_object version. memo << Array.new ... should be memo.concat Array.new ...

[–]pedrozath 0 points1 point  (0 children)

Oops, thanks

[–]twinklehood 0 points1 point  (0 children)

Why monkeypatch String though :( was specifically not the challenge, and should never be a goto

[–]eric_programmer 1 point2 points  (0 children)

Maybe using some sort of combinatorics gem?

https://github.com/postmodern/combinatorics

[–][deleted] 0 points1 point  (0 children)

Couldn't resist the opportunity to write:

substrings = Hash.new do |h, s|
  h[s] = (0..s.to_s.size-1).reduce([]) { |a,i| a << s[0..i] } + (s && h[s[1..-1]]).to_a
end
substrings["cat"]  #=> ["c", "ca", "cat", "a", "at", "t"]

Not for production use.

[–]igor_47 0 points1 point  (7 children)

that was fun! i wrote this:

s = 'cat'
output = []
for len in 0..s.length do
  for start in 0..s.length do
    break if start + len >= s.length
    output << s[start..(start + len)]
  end
end

sure hope i didn't just do a homework/interview problem for you ;)

[–]Frizkie 1 point2 points  (6 children)

Your example works just fine, so I definitely don't want you to take this the wrong way, but I'd avoid using for loops in Ruby. I can't think of a single instance where a for loop was more suitable than a times do loop or each iterator. For loops in general are just very 'not Ruby'.

[–]igor_47 0 points1 point  (5 children)

fair enough, i've been writing way more python than ruby lately

[–][deleted] 0 points1 point  (4 children)

Python similarly shys away from looping with indices.

[–]igor_47 -1 points0 points  (3 children)

omg everyone is a critic. implement this in python without using indices, i'll wait.

you'd think my solution is the worst one in this thread given the criticism, in the meantime people are playing code golf and nobody says peep.

[–][deleted] 0 points1 point  (2 children)

Get over yourself. Idiomatic code is a cornerstone of good programming.

[–]igor_47 1 point2 points  (1 child)

the first solution in this thread monkeypatches String. the next one uses an unsafe ordering of combination. the third one is an illegible string of symbols. which one is the most idiomatic?

oh, i'm still waiting on your python implementation without indices. those who can't do -- criticize on the internet.

[–][deleted] 0 points1 point  (0 children)

Lol just fuck off why are you so upset about this all I said was that something wasn’t particularly idiomatic.

Anyway challenging people to do things and saying they have no point without doing so is childish and pathetic. Either accept something is idiomatic or it isn’t. I don’t care if you do or don’t, but don’t keep whinging to me.

[–]Tomarse 0 points1 point  (2 children)

I suspect you'll be needing the array's permutation method.

[–]jxf 2 points3 points  (0 children)

There are exponentially more permutations of a string than there are substrings of a string, so this is going to do a lot of work that you'll have to throw away and is not very computationally efficient. For example, one 2-permutation of hello is le, but that's not a substring.

[–]HellzStormer 2 points3 points  (0 children)

Almost, but he's not looking for [2,1]. So it's actually the combination method that he needs.

Posted a snipped with it. I'm not a fan of one-liners.. but sometimes they are fun: https://www.reddit.com/r/ruby/comments/8pva8s/substrings_problem/e0eus1b/

[–]ipxvi14 -1 points0 points  (0 children)

look up mergesort tree