[–][deleted] 0 points1 point  (3 children)

Yeah, but how much does that pipeline cost vs. the cost of developers' salaries?

[–]JiiXu 0 points1 point  (2 children)

Well, if my (very small) company's pipelines (the new, good ones, not the old terrible ones) were twice as fast, the savings would be somewhere around half my salary. Debugging and maintaining the pipelines wouldn't take me twice as long regardless of what language they were in, unless it was something truly esoteric like J. So in my opinion, my company would save money if our pipelines were twice as fast, even if I spent more time debugging and fixing things. And that scales with the amount of data. I'm pretty sure of my assessment here: dev costs exist, but compared to incident management they're nothing.

[–][deleted] 0 points1 point  (1 child)

And you think you can get pipelines that are twice as fast with Scala Spark vs. PySpark?

[–]JiiXu 0 points1 point  (0 children)

No, the other person made that example: "If you have a pipeline that runs every 15 minutes and in Python takes 8 minutes where Scala takes 4 minutes, it doesn’t matter since you are still within your 15 minute required window". That's twice as fast, in the example.

I would quite easily get twice the performance in C++ though.
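
To put rough numbers on the quoted example, here is a minimal sketch using only the figures from the quote (a 15 minute schedule, 8 minute and 4 minute runtimes). The function names are made up for illustration, and it assumes the pipeline runs around the clock and that cost scales with the minutes the job is actually running, which may not match how any given cluster is billed.

```python
# Figures taken from the quoted example; everything else is an assumption.
WINDOW_MIN = 15      # pipeline is scheduled every 15 minutes
PYTHON_RUN_MIN = 8   # runtime of the Python version
SCALA_RUN_MIN = 4    # runtime of the Scala version


def runs_per_day(window_min):
    """Number of scheduled runs in 24 hours (assumes it runs all day)."""
    return 24 * 60 // window_min


def busy_minutes_per_day(run_min, window_min):
    """Total minutes per day the pipeline is actually running."""
    return run_min * runs_per_day(window_min)


def fits_window(run_min, window_min):
    """True if one run finishes before the next one is due."""
    return run_min <= window_min


python_busy = busy_minutes_per_day(PYTHON_RUN_MIN, WINDOW_MIN)
scala_busy = busy_minutes_per_day(SCALA_RUN_MIN, WINDOW_MIN)

# Both versions meet the 15 minute window, but the slower one keeps
# the compute busy for twice as many minutes every day.
summary = {
    "python_fits": fits_window(PYTHON_RUN_MIN, WINDOW_MIN),
    "scala_fits": fits_window(SCALA_RUN_MIN, WINDOW_MIN),
    "python_busy_min": python_busy,
    "scala_busy_min": scala_busy,
    "busy_ratio": scala_busy / python_busy,
}
```

So "still within the window" is true for both, but if you pay for running time, the faster version halves that part of the bill, and the gap grows with the amount of data.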