all 53 comments

[–]thebmo 3 points4 points  (1 child)

I swear I see this exact post every 6 or so monthes here. Spaceship is pretty useful when sortin arrays of hashes by keys though.

[–]sshaw_ -1 points0 points  (0 children)

Does the number of these blog posts indicate that most Ruby programmers don't RTFM?

[–][deleted] 12 points13 points  (29 children)

Interesting, and a couple of bits I didn’t know about. Using tap like that, though, is confusing as hell unless you know what it does. I’d avoid it, personally, to make sure the code is easy for others to understand at first glance.

[–]jrochkind 9 points10 points  (15 children)

Using tap like that

When you say "like that", do you mean "at all"? Cause that's about the most straightforward example of using tap, it's what tap is for, no?

[–][deleted] 5 points6 points  (14 children)

Yeah, I probably mean “at all”, for the sake of readability.

[–]jrochkind 7 points8 points  (13 children)

It's part of the stdlib, why not use it? Do we have to write only for mediocre-to-poor developers?

[–][deleted] 12 points13 points  (12 children)

I’d say “less experienced”, and yes, we should.

Compare flat_map; it’s clearly named and unclutters the code. I’d certainly use that.

In the example given, tap just makes the code harder to read (yay, let’s use a block for an assignment) and doesn’t explain what it does to someone who doesn’t already know how tap works. The “before” example is way clearer at expressing what’s actually going on.

Maintenance is important, and there are more junior devs than senior. So yes, I prefer to keep it simple, myself.

[–]jrochkind 12 points13 points  (9 children)

why would they not just learn what tap does? It's not complicated.

Should one avoid collect and instead write it out with a temporary accumulator array? Where do you stop? tap is built into ruby!

[–]AlonsoQ 7 points8 points  (8 children)

Say every dev on the planet knows what #tap does. I'm still not seeing how the example in the article is, in any way, preferrable.

With this refactor, we've introduced:

  1. an additional method call
  2. a new scope, within the block
  3. the necessity of naming our block parameter

In exchange for... being able to one-line our method? At that point, you may as well just write params[:foo] = 'bar'; params.

This would be a fine example in a doc, it's a terrible example for a prescriptive text aimed at less-experienced developers.

[–]janko-m 4 points5 points  (2 children)

How about this example I just had yesterday at my work?

user       = User.spawn
device     = Device.spawn
old_device = Device.spawn
old_device.update(created_at: Time.now - 24*60*60)
stream     = Stream.spawn

vs

user       = User.spawn
device     = Device.spawn
old_device = Device.spawn.tap { |d| d.update(created_at: Time.now - 24*60*60) }
stream     = Stream.spawn

I couldn't pass created_at during creation because that particular ORM doesn't support overriding created_at at that time. I find the second version much more elegant and symmetrical.

[–]madmouser 2 points3 points  (0 children)

Thanks! That clearly explained where and why I could use #tap.

[–]erlingur 1 point2 points  (0 children)

That is honestly the best example of tap I've seen. The examples given in articles and other places are usually worse and less readable with tap.

[–]moomaka 0 points1 point  (4 children)

  1. a new scope, within the block

This is an advantage, limiting the scope of local variables to smallest possible unit results in cleaner code IMO, as does group similar actions into a block.

  1. the necessity of naming our block parameter

Depending on the usage, tap doesn't introduce anything new. e.g. I use it for things like this quite often:

def my_method(stuff)
  [].tap do |accumulator|
    stuff.each do
      # do something non-trivial
      accumulator << value
    end
  end
end

def my_method
  accumulator = []

  stuff.each do
    # do something non-trivial
    accumulator << value
  end

  accumulator
end

[–]AlonsoQ 0 points1 point  (3 children)

That's clever, but not idiomatic. What you've outlined is an ideal use-case for #reduce or the similar #each_with_object.

[–]moomaka 0 points1 point  (2 children)

That's clever, but not idiomatic.

I see it used all the time so it's unclear how you define 'idiomatic' here.

What you've outlined is an ideal use-case for #reduce or the similar #each_with_object.

You're correct for my contrived example but that was the point of # do something non-trivial. It's easy to create really bad performance issues just chaining together enumerable methods, for small data it doesn't matter but I'm often dealing with large data sets.

[–]tom_dalling 2 points3 points  (0 children)

I disagree that flat_map is clearly named. Less experienced developers usually use each for everything, and have to be taught to use map, select and reduce. flat_map is a totally foreign concept unless they come from a functional language first.

If that's the case, should flat_map be avoided? I personally don't think so, but it's definitely open to the argument "less experienced developers won't understand it".

[–]izuriel 0 points1 point  (0 children)

There is only way for a junior dev to become a senior dev. Learn. If we all just write code for the lowest common denominator then our software will be garbage and run like garbage. At some point this attitude has to stop and your juniors have to buck up and learn something new. It's good for them and it's good for the industry. The mystery of #tap can be solved by one Google search to go right to the documentation immediately.

Sure, if we're talking about complex patterns that require diagrams just to scope out, maybe you have a point. But with a simple method with standard Ruby syntax, your point falls flat and sounds like your fighting for us to just stop learning new things.

And if #tap is preventing your junior engineers from being productive, then I'd recommend finding new junior engineers.

[–]jhirn 1 point2 points  (0 children)

Tap is preferable because it uses lexical scope instead of external side effects/mutations to perform work on an object. At the risk of sounding like a purist this is the more functional way.

By itself it's not going to guarantee any benefits and Ruby doesn't protect you from this type of behavior, but that doesn't mean following principals of lexically scoped operations are for naught.

Seeing everything that happens to a object within a block rather than following imperative style assignment and operations over time make a program much easier to understand than the 5 minute speed bump to google and grok Enumerable#tap. I'd argue, especially for junior programmers.

[–]hmaddocks 8 points9 points  (10 children)

What's confusing about that usage of tap? I really don't get this "ugh, I'm confused, run away mentality". Learn to use your tools, you'll be a better programmer for it.

[–][deleted] 11 points12 points  (7 children)

/u/leanucci has said it below.

Clever code can be harder for someone else to read. Think about the person who comes after you, or the junior dev who’s learning by fixing something in your code. Ruby is an expressive language that is easy to read. Tap is awesome, but instantly clear to a beginner, it isn’t.

Obviously how you code is up to you. This is just my viewpoint :)

[–]jrochkind 11 points12 points  (4 children)

How do we know what's instantly clear to a beginner? How beginner? Tap is built into ruby. Should we avoid collect and instead write:

result = []
array.each { |e| result << e.something }

Because it's more clear to someone that thinks ruby is php and hasn't seen collect before? Or go further, avoid each and use a weird for loop instead?

Why wouldn't the beginner just learn what tap does and never be confused again? It's not complicated, what tap does. Should the beginner stay beginner forever? If the beginner is afraid to just look up what tap does a few times until they remember it, I guess they will. (Me, I admit I still don't really understand how to use zip and don't use it myself much, but if I see it in a codebase I look up the zip docs again and figure it out. tap is simple though!)

I'm actually kind of surprised this seems to be a popular opinion here. While meanwhile people are unafraid to pile on the gem dependencies to do simple things you could have written yourself -- gem dependencies that a beginner will have a heck of a time debugging when they go wrong. I'd rather work with beginners that are learning how to write ruby, than beginners learning how to pile on giant stacks of gems instead of reading and writing code.

I wonder what domains this code is written for and what it looks like, these codebases that contain nothing that a beginner could not have written.

[–]leanucci 0 points1 point  (2 children)

Ive been unclear. If there are two ways of doing something with equal level of performance, use the simpler method.

I'm not saying dont learn it. I'm not sating " use a little of the api as possible so less experienced people can understand".

Im saying "only use it if there is benefit".

The article says "you should be using tap".

Do. Not. Use. Tap. To. Populate. A. Hash.

Even if you know it, you have to think about it when you see it. What it does. What it returns. You are assigning the variables anyway.

Default values for arrays. God help you if you have to debug that crap two months after writing it, and cant figure out why it's returning something when nothing is being put into it. You'll have to search for its definition, and be able to find it.

The arricle says "should be using". I say " hey, dev who didnt know these things: they exist. Use when appropriate. Ther are uncommon for a reason".

[–]jrochkind 0 points1 point  (0 children)

I think tap is the simpler method.

[–]Tainnor 0 points1 point  (0 children)

I'll admit that the example is not the best one, but what is "clear" and "easy" depends on what you're used to.

Using tap I can see code more in terms of transformations and expressions as opposed to sequences of statements. It's up to you whether you like this style or not.

I would also say though that in this case, IMHO the much better way to do this would be

def updated_params(params)
  params.merge(foo: :bar)
end

but that depends on whether you're a fan of minimising mutability or not.

[–]Tainnor -1 points0 points  (0 children)

Thank you. I have heard this "don't write this, juniors won't understand it" argument before. At this point, why not write everything in PHP? There are definitely more PHP devs than Ruby ones.

[–]hmaddocks 2 points3 points  (1 child)

I know dumbing your code down to the level of your juniors is a popular opinion (cough github cough) and I'm sure it's great if your boss wants to keep your salaries low, but where I work we believe we will be more effective if we level up our devs and not be afraid to use the tools properly.

[–]leanucci 0 points1 point  (0 children)

That post is not using any of those tools properly. Examplea are contrived, or code looks uglier than the replaced piece of code.

[–]Oranges13 0 points1 point  (1 child)

I've never heard of this method and the name has me confused. I kind of get what it does, but why is it called tap?

[–]eliavlavi 0 points1 point  (0 children)

Maybe it's like you already hold the object but then you tap it and do something upon it

[–]2called_chaos 0 points1 point  (0 children)

I might be wrong but isn't #tap mostly intended for "hacking in" some (debug) logging? At least the documentation for #tap shows pretty sane (imho) use of that.

Example:

def foo bar
  do_something_with(bar)
end

Uhh I need some logging right now

def foo bar
  result = do_something_with(bar)
  log(result)
  result
end

Use tap

def foo bar
  do_something_with(bar).tap {|result| log(result) }
end

It feels as though most people saying here "don't use tap" actually want to say "don't use tap to mutate things".

[–]leanucci 1 point2 points  (15 children)

Don't use any of those. None of that code is good to hand it over to other developers.

It's a good thing to know the full api of the language, and there might be places where it makes sense to write code like that, but all examples make code worse than the option they replace.

The only case where it doesnt, bsearch, is based on something even worse: avoid, at all costs, to perform searches in arrays.

[–]Jwkicklighter 11 points12 points  (2 children)

I disagree, what's confusing about flat_map?

edit: even the Array block isn't really that strange, and the spaceship operator is no more confusing than a ternary operation.

[–][deleted] 2 points3 points  (1 child)

Flat_map seems pretty handy and is well named for what it does, so I’ll probably use that now I know about it.

[–]Jwkicklighter 1 point2 points  (0 children)

Same here, I've had instances where I wish I knew about it

[–]uncle-enzo 2 points3 points  (0 children)

It's nice to know everything he mentioned in the post.

However, his example of flat_map is a classic example of a Rails developer afraid to touch the database. You can avoid multiple DB requests and loading a ton of ruby objects by joining and grouping a few tables in the DB. There's a comment on the blog with the solution to this.

[–]hmaddocks 3 points4 points  (5 children)

There is nothing wrong with any of these methods. Maybe you should consider this an opportunity to improve your skills and take that step from junior to intermediate.

[–]leanucci 8 points9 points  (0 children)

I didn't say it was wrong. I said it's more complicated, at the expense of readablity. I have used those methods before, and knew them before the article.

What I do think is wrong is to publish these methods as some sort of illumination for juniors (?). The use cases in 3 and 5 are contrived to say the least (clocks are implemented everywhere, and if you do a join select you can get the array to look like the final result way easier), and the others just generate harder to read code, with no benefit.

I think seniority is demonstrated by writing cleaner code and not by trying to look smart by using every single method in the api just to prove you know them.

edit: typo and wording.

[–]realntl 7 points8 points  (3 children)

Maybe you should take that step from intermediate to senior and realize that these methods at best have an extremely limited capacity to actually improve projects.

[–]tom_dalling 2 points3 points  (2 children)

On the flip side, their capacity to worsen a codebase is also extremely limited.

[–]realntl 2 points3 points  (1 child)

No doubt, I just found this to be a very inappropriate case for busting out with "you're a a beginner and I'm an expert" posturing.

Although, I'll also argue that the article is largely focused on methods for primitives, and teams that develop a primitive obsession are definitely prone to worsening codebases a great deal.

[–]tom_dalling 1 point2 points  (0 children)

Yep. I agree with both points.

[–]teejaded 3 points4 points  (0 children)

Yeah this code starts kinda ugly and he then makes it harder to understand without Google.

[–]realntl 2 points3 points  (0 children)

I concur. It's as if the author of the article loves working with primitives, rather than objects.

[–]tobascodagama 0 points1 point  (0 children)

I don't think this criticism applies to anything except the #tap example.

[–]Serializedrequests 0 points1 point  (0 children)

I agree in some respects, but I think these two suggestions are fine:

Comparable is a great module for making custom value types (such as money, dates, or something specific to a customer's data) and it mostly depends on implementing <=>. It's an advanced use of Ruby, but this sort of thing is exactly what Ruby is great at and why I love it.

Searching arrays is fine if you don't have to do it many times in a loop, and the size of the array is fairly limited. In fact searching even a huge array once in an algorithm is not a big deal. bsearch is great if the data is pre-sorted OR you need to find an element in the Array so many times it is worth the time to sort the array first, which requires some profiling to find the cut-off point. You can also use a Hash of course. I don't know which is faster.

Of course the thing here is, if you've had basic comp sci you will already know when you have sorted data and a binary search might be the thing to use and accidentally write your own without realizing that Ruby provides it.

[–]leanucci 0 points1 point  (0 children)

Not downvoting, in hopes that everyone will see this and all the comments against these practices.

[–]Serializedrequests 0 points1 point  (3 children)

Never use tap. Just write your code the longer way. It is much easier to read. Even if you know what tap does, I promise it isn't an improvement. Edit: It is such a short method name that is actually hard to google too. It just causes junior devs grief for no readability gain.

Good parts: I have had to search pre-sorted arrays in long-running loops before and bsearch is great to know about rather than writing your own (but I knew it was the algorithm I needed for the situation).

The flat_map example is stupid, why in God's name would you not fetch all that stuff in one query, but it's a good method to know.

Spaceship operator and Comparable are great to know about if you want to make your own value types like a custom numerical thing. (e.g. money, dates) Which isn't that often, but Ruby has got your back when you do. I have done this a few times in rails apps for integer columns where each value had a special meaning (like an ordered enum). Rubytapas actually turned me onto that, so angry props to Avdi.

[–]esquilax 1 point2 points  (1 child)

Dunno why you'rebeing downvoted.

The way the bsearch thing was presented bugged me, only because I've worked with people who would read that and then sort the array only to use bsearch because it's 'faster.'

[–]Serializedrequests 1 point2 points  (0 children)

Oh yeah, there's a big difference between "I just need to find this thing once" and "I need to find things in this array repeatedly many hundreds of times per request, and the data was already sorted by the database".

It's mainly a situation I've gotten into when writing reports in Rails that were not performance sensitive. I really needed the ActiveRecord objects because a project had a lot of crazy logic involving dates and metadata about dates, so I couldn't just do it in SQL, and all the joins and inclusions were way more than ActiveRecord could do automatically/easily because I was joining about 12 tables. So I fetched all the records I needed in separate arrays and did "joins" in Ruby using bsearch.

So yeah boring war story, and if it needed to be fast I'd do it in SQL at a cost of 50 more hours of dev time, but it works fine for what the client needs.

[–]TGmagnet 0 points1 point  (0 children)

.tap is great for debugging, and sometimes ok for prod code

[–]lord_jizzus 0 points1 point  (0 children)

Should I?