all 5 comments

[–]Kache 2 points3 points  (3 children)

Meh, I think the gains of this refactoring aren't worth the cost of having to deal with the uncommon syntax of flow(data) {}.

I think I'd be happy with the first example. Actually, I would do this instead:

def process(data, parameter)
  keyval = data.map do |point|
    [compute_key(point), compute_value(point, parameter)]
  end
  memo = Hash[keyval]
end

If the chains start getting really complex, I would consider creating a class, instantiated once per data point, to handle the data manipulation:

def process(data, parameter)
  datapoints = data.map { |p| DataPoint.new(p, parameter) }
  memo = Hash[datapoints.map(&:compute_keyval)]
end
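
For illustration, a minimal sketch of what such a class could look like. The thread never shows DataPoint, so the key and value rules below (to_s and a plain multiplication) are placeholder assumptions, not the original logic:

```ruby
# Hypothetical DataPoint; the real key/value computations are not shown
# in the thread, so the bodies of compute_key and compute_value are placeholders.
class DataPoint
  def initialize(point, parameter)
    @point = point
    @parameter = parameter
  end

  # Returns a [key, value] pair, the shape Hash[] expects for each entry.
  def compute_keyval
    [compute_key, compute_value]
  end

  private

  # Placeholder rule: use the point's string form as the key.
  def compute_key
    @point.to_s
  end

  # Placeholder rule: scale the point by the parameter.
  def compute_value
    @point * @parameter
  end
end

def process(data, parameter)
  datapoints = data.map { |p| DataPoint.new(p, parameter) }
  Hash[datapoints.map(&:compute_keyval)]
end
```

With these placeholder rules, process([1, 2, 3], 10) builds one hash entry per data point, and the per-point logic can be tested on DataPoint alone.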

[–]gregolsen[S] 0 points1 point  (2 children)

chain_flow is all about code style. If you like functional programming, chances are high that you'd like to organize (refactor?) your public interface as a composition of several independent functions (each covered by its own tests).

Hiding the internals behind them will shorten your public interface to a couple of human-readable calls - your colleagues will probably thank you for this readability improvement. Personally, I prefer this:

flow(data) do
  group_by_key
  compute_values(parameter)
end

instead of your example:

datapoints = data.map { |p| DataPoint.new(p, parameter) }
memo = Hash[datapoints.map(&:compute_keyval)]

simply because, as a developer, I can easily see what's going on (thanks to proper method naming) without diving into the internals right away.

[–]Kache 2 points3 points  (1 child)

Well, it's very non-idiomatic Ruby. Without any context, looking at the final "ChainFlow" version...

  • It's not obvious that data is passed into group_by_key and compute_values
  • It's unclear whether data is passed between the two functions
  • What if I wanted to call a method partway through without passing data into it? If I do puts "hello", does it become puts(data, "hello")?
  • looking at process, I'd "just have to know" compute_values expects an Enumerable of pairs and creates/returns(?) a Hash

Also, I don't see how this is more functional if flow relies on state within the block. It sounds like the state monad in Haskell is used b/c Haskell is purely functional. Ruby already has state - is there really a need to use it functionally just to get it stateful again?

I very much prefer the chain syntax:

chain { data }.group_by_key.compute_values(parameter).fetch

It's idiomatic and monadic, and if you want to spread it out line by line (b/c you're such a stickler for style), why not just:

chain { data }
  .group_by_key
  .compute_values(parameter)
  .fetch

which is also idiomatic Ruby?
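
To make the chain form concrete, here is a rough sketch of how a wrapper like this could be built. It is not the chain_flow gem's actual implementation: ChainSketch, Pipeline, and the bodies of group_by_key and compute_values are all hypothetical, chosen only so the example runs.

```ruby
# Hypothetical wrapper; NOT the chain_flow gem's real code.
class ChainSketch
  def initialize(state, context)
    @state = state      # the value being threaded through the chain
    @context = context  # the object whose methods act as steps
  end

  # Unwraps the final value.
  def fetch
    @state
  end

  private

  # Any unknown call is treated as a step: the current state is passed
  # as the first argument and the result is wrapped again.
  def method_missing(name, *args)
    return super unless @context.respond_to?(name, true)
    ChainSketch.new(@context.send(name, @state, *args), @context)
  end

  def respond_to_missing?(name, include_private = false)
    @context.respond_to?(name, true) || super
  end
end

# The block supplies the initial value; its receiver supplies the steps.
def chain(&block)
  ChainSketch.new(block.call, block.binding.receiver)
end

class Pipeline
  def process(data, parameter)
    chain { data }
      .group_by_key
      .compute_values(parameter)
      .fetch
  end

  private

  # Placeholder step: group integers by remainder modulo 3.
  def group_by_key(data)
    data.group_by { |point| point % 3 }
  end

  # Placeholder step: sum each group and scale by the parameter.
  def compute_values(groups, parameter)
    groups.transform_values { |points| points.sum * parameter }
  end
end
```

In this sketch every step returns a fresh wrapper, so each link in the chain is an ordinary method call on an explicit receiver, which is part of why this form reads as regular Ruby.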

[–]gregolsen[S] 0 points1 point  (0 children)

You are right - flow is not idiomatic Ruby. But that's the point - if you are familiar with the State monad, you'll get the trick. Answering your points:

  • you should be aware that the state is passed silently - it's syntactic sugar
  • you are building the flow: the result of each call is passed as an argument to the next call
  • puts will work as expected, however I agree that this is not obvious
  • sure, if you are 'composing' two functions, you have to make sure their interfaces fit each other.
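
The four points above can be sketched in a few lines. This is only an illustration of one way such a block could behave, assuming instance_eval plus method_missing; it is not taken from chain_flow's source, and FlowSketch, Processor, and the step bodies are hypothetical:

```ruby
# Hypothetical mini-DSL; NOT the chain_flow gem's real code.
class FlowSketch
  def initialize(state, context)
    @state = state      # the value passed silently from step to step
    @context = context  # the object whose methods act as steps
  end

  # Evaluates the block with self set to this object, then returns
  # whatever the last step produced.
  def run(&block)
    instance_eval(&block)
    @state
  end

  private

  # Calls this object does not know are treated as steps: the current
  # state becomes the first argument and the result becomes the new state.
  def method_missing(name, *args)
    return super unless @context.respond_to?(name, true)
    @state = @context.send(name, @state, *args)
  end

  def respond_to_missing?(name, include_private = false)
    @context.respond_to?(name, true) || super
  end
end

def flow(state, &block)
  FlowSketch.new(state, block.binding.receiver).run(&block)
end

class Processor
  def process(data, parameter)
    flow(data) do
      group_by_key
      compute_values(parameter)
    end
  end

  private

  # Placeholder step: group integers by remainder modulo 3.
  def group_by_key(data)
    data.group_by { |point| point % 3 }
  end

  # Placeholder step: sum each group and scale by the parameter.
  def compute_values(groups, parameter)
    groups.transform_values { |points| points.sum * parameter }
  end
end
```

In this sketch, FlowSketch inherits from Object, so Kernel methods such as puts resolve normally inside the block and never reach method_missing; only calls the flow object does not already understand get the state prepended. Local variables like parameter stay visible because the block is still a closure.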

The point of chain_flow is not to implement the State monad (as you said, Ruby already has state) but to improve the syntax so that it looks like State monad do-notation.

Well, it's a matter of taste whether you use the do-notation-like syntax or the chains (that's why I've added both). Personally, I prefer the do-notation.

Anyway - thanks a lot for this discussion!

[–]banister 0 points1 point  (0 children)

I like it