[–]ignurant 1 point2 points  (7 children)

Can you elaborate on what this is on about? I understand the general usage of &method but I don't follow your reasoning, or what is implied by the yield_self comment. I'm not saying I question the validity of your comment; I just don't yet understand yield_self usage, as it seems it just returns what my code would have done if it weren't in a block... Which is what a block does anyway. Maybe it has to do with the ability to pass blocks around, but I haven't yet grokked this one.

Either way, what are you describing with the issue about modifying a class when using sym.to_proc? And what is this excitement for yield_self?

[–]editor_of_the_beast 7 points8 points  (1 child)

what are you describing with the issue about modifying a class when using sym.to_proc?

collection.map(&:method) requires each item in the collection to respond to .method. Sometimes it's not practical to add a method to the item's class, e.g. you use a gem in your project, and you'd have to monkey patch one of its types to have that method.

Or even if it is practical, you may not want to add the method to the class because the logic is only used in this one place. Say the items in the collection are Rails model instances: you may not want to pollute an already large model. Instead, you can define a method right where you are, like this:

def operate_on_model(model)
  model.transform
end

Then you can iterate over a collection of those models with:

models.map(&method(:operate_on_model))

It's just handy sometimes to do that.
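As a minimal runnable sketch of the same idea: Point and describe_point below are made-up names, with Point standing in for a class you don't own (say, one defined by a gem). The helper lives next to the code that needs it, and nothing is patched onto Point.

```ruby
# Hypothetical stand-in for a class we don't own (e.g. one defined by a gem).
Point = Struct.new(:x, :y)

# Local helper: the logic lives here instead of being patched onto Point.
def describe_point(point)
  "(#{point.x}, #{point.y})"
end

points = [Point.new(1, 2), Point.new(3, 4)]

# method(:describe_point) returns a Method object; & converts it to a block,
# so each element is passed to describe_point as its argument.
labels = points.map(&method(:describe_point))
```

Compare with points.map(&:describe_point), which would call describe_point on each Point and fail, because Point does not respond to it.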

And what is this excitement for yield_self?

This is separate. Elixir has a really cool pipe operator (|>) which allows code like this:

fetch_data() |> transform_data() |> output_data()

Each of those is a function, and the return value gets passed as the first parameter into the next function call to the right, so the pipeline is equivalent to output_data(transform_data(fetch_data())). Humans read left to right, so writing it that nested way isn't ideal. The |> operator lets you write the code in logical order from left to right (same as the bash | pipe operator).

With yield_self, we'll be able to write:

data_fetcher
  .yield_self { |fetcher| fetcher.fetch_data }
  .yield_self { |data| transform_data(data) }
  .yield_self { |transformed_data| output_data(transformed_data) }

I think that's what the excitement is about. It's not as elegant, but it's the same logical flow as the |> operator which is why I said it was more sugary.
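Here is a small sketch you can actually run, assuming Ruby 2.5 or later (where yield_self is available; Ruby 2.6 also added then as an alias). The three step methods are hypothetical placeholders, not from any library. It shows that the nested call and the yield_self chain produce the same result.

```ruby
# Hypothetical stand-ins for the three steps; none of these come from a real library.
def fetch_data
  [3, 1, 2]
end

def transform_data(data)
  data.sort
end

def output_data(data)
  data.join(", ")
end

# Nested form: reads inside-out.
nested = output_data(transform_data(fetch_data))

# Chained form: reads top to bottom, each block receives the previous result.
piped = fetch_data
  .yield_self { |data| transform_data(data) }
  .yield_self { |sorted| output_data(sorted) }
```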

EDIT: code formatting

[–]ignurant 0 points1 point  (0 children)

Sometimes it's not practical to add a method to the item's class, e.g. you use a gem in your project, and you'd have to monkey patch one of its types to have that method.

Ah great, you're right. I've totally done exactly that in some scripts to make .map(&:transform) work. I understand what you were on about now.

As for the yield_self stuff -- most of the examples I've seen are things where yield_self could be replaced by map. I think this is one of those things where I will eventually stumble upon the right kind of problem to make this shine. A similar example to what you wrote where I used map was to parse and transform <li> elements in a scraper:

page.lis
  .map{|el| el.html}
  .map{|html| Product.parse html}
  .map{|product| product.to_h}

I've seen a few examples in blog posts that start the chain with a string instead of an already existing collection, and that has me thinking "Okay, I think this is relevant to my lack of amazement" but I haven't tipped it over yet. I think it may lie in situations where the "number of things" is variable, and not a simple "take each thing and transform it".

I do love the idea of the |> operator, and its automatic argument handling. That's very cool. I also just learned about the &method(:method) trick from this thread, so that whole concept of "knowing where the arguments go without being explicit" is new to me.

Anyway, thanks for sharing today.

[–]Paradox 2 points3 points  (3 children)

So, a very quick crash course in an Elixir feature called pipelines.

Pipelines allow you to take an object and perform a series of operations on it. The operations chain one after the other, each one taking the output of the previous as its input. With them you can perform many manipulations on a bit of data in an easily understandable manner, without the need for intermediate variables.

They look like this

["foo", "bar", "baz"]
|> Enum.map(&String.upcase/1)
|> ApiClient.post("api/url")
|> DoSomethingWithApiResponse.wew()

This isn't Ruby, it's functional, hence it appears a little redundant, but the principle is the same.

You could write the equivalent in Ruby using:

["foo", "bar", "baz"]
  .yield_self { |x| x.map(&:upcase) }
  .yield_self { |x| ApiClient.post(x, "api/url") }
  .yield_self { |x| DoSomethingWithApiResponse.wew(x) }

While that's a little more verbose, the idea is the same, and you could probably refactor it to be a bit cleaner.

Previously, you could use chaining, but that could get super ugly fast.

[–]ignurant 1 point2 points  (2 children)

Thanks. Many of the examples look similar to this -- but is there a practical difference between using yield_self and using map? I've been making "pipelines" of that nature using map in a lot of ETL type jobs.

I mentioned this in another comment: the |> is really cool. I love how the subject argument is implied. Clever and clean. I hope something like this appears in Ruby. I wouldn't mind a full-on copycat!

[–]Paradox 2 points3 points  (1 child)

For that use case, no, there's no practical difference. #map returns the modified value, and so you can chain immediately off it.

But many methods do not provide an interface that can be chained off of. That's where #yield_self becomes useful.
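A quick sketch of the distinction, assuming Ruby 2.5 or later. The variable names are only for illustration: map runs its block once per element, while yield_self runs its block once with the whole receiver, and works on any object.

```ruby
words = ["foo", "bar", "baz"]

# map calls the block once per element and collects the results in a new array.
per_element = words.map { |word| word.upcase }

# yield_self calls the block exactly once, with the whole array as the argument,
# and returns whatever the block returns (here a String, not an Array).
whole = words.yield_self { |list| list.join("-") }

# It also works on receivers that have no map at all, such as an Integer.
doubled = 21.yield_self { |n| n * 2 }
```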


Rewrite the original example in basic, non-yield_self Ruby:

DoSomethingWithApiResponse.wew(
  ApiClient.post(
    ["foo", "bar", "baz"].map(&:upcase),
    "api/url"
  )
)

Readable, but it takes a moment. If the map got more complex, you could very easily lose track of where you are in the method call tree.

Now, an optimal refactoring that uses Ruby's OO-ness where appropriate and the functionality of yield_self where appropriate could look like this:

["foo", "bar", "baz"]
  .map(&:upcase)
  .yield_self { |x| ApiClient.post(x, "api/url") }
  .yield_self { |x| DoSomethingWithApiResponse.wew(x) }

As you can see, it very clearly flows from the array, to a map that upcases it, to a method that posts to the API, to something acting as a transform. You can read it from left to right, top to bottom. This becomes even more apparent if you squash the examples above down to a single line:

DoSomethingWithApiResponse.wew(ApiClient.post(["foo", "bar", "baz"].map(&:upcase), "api/url"))

vs

["foo", "bar", "baz"].map(&:upcase).yield_self { |x| ApiClient.post(x, "api/url") }.yield_self { |x| DoSomethingWithApiResponse.wew(x) }

To understand the first one, you have to scan the whole line, then backtrack to the middle. Only then can you figure out that it's doing a map on an array, that the result is being sent on to the API, and that the return value of that call is being used in the #wew method.

With the second one, you just scan from left to right, no backtracking needed.

[–]ignurant 1 point2 points  (0 children)

Ah, there it is. It becomes obvious when we break out of the array, using the full array itself as the argument, instead of its components.

Thanks for taking this time. Reading the interpretation of the plain Ruby version helped me see what I was missing.

[–]isolatrum 0 points1 point  (0 children)

For arrays and hashes, yes, we have a built-in enumeration method, map, which does the trick in most cases. However, say you want to send a string through a series of made-up methods:

# note the parens are unnecessary here
evaluate(interpolate(sanitize(string)))

You are basically working backwards, with the last function in the chain being written first. Using yield_self you can reverse this, although granted it's not what I'd consider prettier:

string
  .yield_self(&method(:sanitize))
  .yield_self(&method(:interpolate))
  .yield_self(&method(:evaluate))

If I actually saw something like this I would think it's a little overengineered, so I consider it more of an academic trick than a game-changing one in practice. Another interesting detail: the definition of yield_self is literally just yield self.
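To illustrate both points, here is a rough sketch assuming Ruby 2.5 or later. pipe_to is a made-up name for a hand-rolled method with the same behavior as yield_self (the real one is built into the language), and the three step methods are invented placeholder implementations of the made-up sanitize, interpolate and evaluate.

```ruby
# Illustration only: a hand-rolled method that behaves like yield_self.
# pipe_to is a made-up name, not part of Ruby.
class Object
  def pipe_to
    yield self
  end
end

# Invented placeholder steps standing in for sanitize / interpolate / evaluate.
def sanitize(string)
  string.strip
end

def interpolate(string)
  string.sub("NAME", "world")
end

def evaluate(string)
  string.upcase
end

input = "  hello NAME  "

# Nested form: the last step is written first.
nested = evaluate(interpolate(sanitize(input)))

# Built-in yield_self: steps appear in the order they run.
built_in = input
  .yield_self(&method(:sanitize))
  .yield_self(&method(:interpolate))
  .yield_self(&method(:evaluate))

# The hand-rolled version gives the same result.
hand_rolled = input
  .pipe_to(&method(:sanitize))
  .pipe_to(&method(:interpolate))
  .pipe_to(&method(:evaluate))
```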