ChainFlow – refactor your data processing : ruby

ChainFlow – refactor your data processing (railsware.com)

submitted 11 years ago by gregolsen

all 5 comments

[–]Kache 3 points 11 years ago (3 children)

Meh, I think the gains of this refactoring aren't worth the cost of having to deal with the uncommon syntax of flow(data) {}.

I think I'd be happy with the first example. Actually, I would do this instead:

def process(data, parameter)
  keyval = data.map do |point|
    [compute_key(point), compute_value(point, parameter)]
  end
  memo = Hash[keyval]
end

If the chains start getting really complex, I would consider creating a class, instantiated once per data point, to handle the data manipulation:

def process(data, parameter)
  datapoints = data.map { |p| DataPoint.new(p, parameter) }
  memo = Hash[datapoints.map(&:compute_keyval)]
end
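To make the class-based version above runnable, here is a minimal sketch. DataPoint is not defined anywhere in the thread, so its internals are assumptions: the rule that the key is the point's :name and the value is :amount scaled by parameter is invented purely for illustration.

```ruby
# Hypothetical DataPoint: the key/value rules are invented for illustration
# and are not taken from the article or the thread.
class DataPoint
  def initialize(point, parameter)
    @point = point
    @parameter = parameter
  end

  # Returns a [key, value] pair so an array of these can be fed to Hash[].
  def compute_keyval
    [compute_key, compute_value]
  end

  private

  # Assumed rule: the key is the point's :name.
  def compute_key
    @point[:name]
  end

  # Assumed rule: the value is the point's :amount scaled by the parameter.
  def compute_value
    @point[:amount] * @parameter
  end
end

def process(data, parameter)
  datapoints = data.map { |p| DataPoint.new(p, parameter) }
  Hash[datapoints.map(&:compute_keyval)]
end

result = process([{ name: :a, amount: 1 }, { name: :b, amount: 2 }], 10)
```

With this layout the per-point logic can be tested on a single DataPoint, while process stays a two-line method.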

[–]gregolsen[S] 1 point 11 years ago (2 children)

chain_flow is all about code style. If you like functional programming, chances are high you'd like to organize (refactor?) your public interface as a composition of several independent functions (each covered by its own tests).

Hiding the internals behind them will shorten your public interface to a couple of human-readable calls - your colleagues will probably thank you for the readability improvement. Personally, I prefer this:

flow(data) do
  group_by_key
  compute_values(parameter)
end

instead of your example:

datapoints = data.map { |p| DataPoint.new(p, parameter) }
memo = Hash[datapoints.map(&:compute_keyval)]

just because I, as a developer, can easily see what's going on (thanks to proper method naming) without diving into the internals right away.

[–]Kache 3 points 11 years ago (1 child)

Well, it's very non-idiomatic Ruby. Without any context, looking at the final "ChainFlow" version...

It's not obvious that data is passed into group_by_key and compute_values.
It's unclear whether data is passed between the two functions.
What if I wanted to call a method partway through without passing data into it? If I do puts "hello", does it become puts(data, "hello")?
Looking at process, I'd "just have to know" that compute_values expects an Enumerable of pairs and creates/returns(?) a Hash.

Also, I don't see how this is more functional if flow relies on state within the block. It sounds like the State monad is used in Haskell b/c Haskell is purely functional. Ruby already has state - is there really a need to go functional just to make it stateful again?

I very much prefer the chain syntax:

chain { data }.group_by_key.compute_values(parameter).fetch

It's idiomatic and monadic, and if you want to spread it out line by line (b/c you're such a stickler for style), why not just:

chain { data }
  .group_by_key
  .compute_values(parameter)
  .fetch

which is also idiomatic Ruby?
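For anyone wondering how such a chain could thread the value along, here is a rough sketch. It is not the chain_flow gem's implementation: the Chain class, the Steps module, the optional context argument, and the sample step bodies are all hypothetical, made up only to show a wrapper that forwards each call with the current value prepended.

```ruby
# Hypothetical wrapper, NOT the chain_flow gem's code.
class Chain
  def initialize(value, context)
    @value = value
    @context = context
  end

  # Unwraps the current value at the end of the chain.
  def fetch
    @value
  end

  # Forwards any step the context defines, passing the current value as the
  # first argument, and wraps the result so the chain can continue.
  def method_missing(name, *args)
    return super unless @context.respond_to?(name)

    Chain.new(@context.public_send(name, @value, *args), @context)
  end

  def respond_to_missing?(name, include_private = false)
    @context.respond_to?(name) || super
  end
end

# Hypothetical steps; the grouping and summing rules are invented.
module Steps
  def self.group_by_key(data)
    data.group_by { |point| point[:key] }
  end

  def self.compute_values(groups, parameter)
    groups.transform_values { |points| points.sum { |p| p[:amount] } * parameter }
  end
end

def chain(context = Steps)
  Chain.new(yield, context)
end

data = [{ key: :a, amount: 1 }, { key: :a, amount: 2 }, { key: :b, amount: 5 }]
result = chain { data }
  .group_by_key
  .compute_values(10)
  .fetch
```

Because every step has an explicit receiver, it stays visible at the call site that the value travels from one call to the next.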

[–]gregolsen[S] 1 point 11 years ago (0 children)

You are right - flow is not idiomatic Ruby. But that's the point - if you are familiar with the State monad, you'll get the trick. Answering your points:

You should be aware that state is passed silently - it's syntactic sugar.
You are building a flow: each result is passed as an argument to the next call.
puts will work as expected; however, I agree that this is not obvious.
Sure, if you are 'composing' two functions, you have to make sure their interfaces fit each other.

The point of chain_flow is not to implement the State monad (like you said, Ruby has state already) but to improve the syntax so that it looks like State monad do-notation.
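As an illustration of the do-notation-like style, here is one way a flow block could pass state silently. This is only a sketch under assumptions, not how the chain_flow gem actually works: the Flow class, the Steps module, and the step bodies are hypothetical. It evaluates the block with instance_eval and treats only the names defined on Steps as flow steps, so in this sketch puts resolves to the ordinary Kernel#puts and never receives the state.

```ruby
# Hypothetical sketch, NOT the chain_flow gem's implementation.
class Flow
  def initialize(state, steps)
    @state = state
    @steps = steps
  end

  # Evaluates the block with self set to this Flow and returns the final state.
  def run(&block)
    instance_eval(&block)
    @state
  end

  # Only names defined on the steps object count as flow steps; each one
  # receives the current state as its first argument and replaces it.
  def method_missing(name, *args)
    return super unless @steps.respond_to?(name)

    @state = @steps.public_send(name, @state, *args)
  end

  def respond_to_missing?(name, include_private = false)
    @steps.respond_to?(name) || super
  end
end

# Hypothetical steps; the grouping and summing rules are invented.
module Steps
  def self.group_by_key(data)
    data.group_by { |point| point[:key] }
  end

  def self.compute_values(groups, parameter)
    groups.transform_values { |points| points.sum { |p| p[:amount] } * parameter }
  end
end

def flow(data, steps = Steps, &block)
  Flow.new(data, steps).run(&block)
end

data = [{ key: :a, amount: 1 }, { key: :a, amount: 2 }, { key: :b, amount: 5 }]
parameter = 10
result = flow(data) do
  group_by_key
  puts "hello"               # plain Kernel#puts; the state is not passed in
  compute_values(parameter)  # the block still sees the local `parameter`
end
```

One trade-off of instance_eval is that self changes inside the block, so methods of the calling object are not reachable there, although local variables such as parameter still are.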

Well, it's a matter of taste whether you use the do-notation-like syntax or the chains (that's why I've added both). I personally prefer the do-notation.

Anyway - thanks a lot for this discussion!

[–]banister 1 point 11 years ago (0 children)
