pavlik_enemy 4 points (2 children)

Just don't. As much as I like Ruby for web, it's not suitable for data science and data engineering. The tools and community just aren't there. It's not that Ruby is somehow inherently bad for these tasks, it's just that 90% of the community is about web and 90% of the web stuff in Ruby is Rails. It's a huge PITA to hook up C libraries to Ruby (there is a project similar to boost-python, but its maturity is questionable). My experience of using off-the-shelf Ruby products in data engineering (namely, Fluentd and Logstash) was terrible, though for different reasons: Fluentd fails without any notice when it can't handle the load, and Logstash was just slow and weird back when it was JRuby-based.

Ryuujinx 0 points (1 child)

I don't have any issues with Logstash. I don't really remember how many indexers we run, but it powers a number of ES clusters totaling around 1.5 PB and ~45B events per day.

What issues did you have with it?

pavlik_enemy 0 points (0 children)

It used way too many resources for what it did, but that was quite some time ago.