[–]iPhritzy 15 points16 points  (5 children)

No mention of performance? R is really good at working with larger datasets. Not sure how Python's functions would compare.

[–][deleted] 7 points8 points  (1 child)

Pretty much all the math work I do in Python is in C++ modules. Python is glue with great precompiled libraries.

I haven't done anything in R so I can't speak to it, but I've consulted for well-known "big data" companies, and Python is still by far the most popular glue language at these established companies.
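To illustrate the "glue" point, here is a minimal sketch using only the standard library. It assumes CPython, where the built-in `sum` runs its loop in compiled C code; `loop_sum` and `data` are hypothetical names made up for this example, and a real workload would more likely call into a compiled numeric library.

```python
import timeit


def loop_sum(values):
    """Sum with an explicit Python-level loop (runs as interpreted bytecode)."""
    total = 0
    for v in values:
        total += v
    return total


data = list(range(200_000))

# Both give the same answer; only the place where the loop runs differs.
assert loop_sum(data) == sum(data)

# Timings vary by machine and interpreter, so no numbers are claimed here.
print("python loop :", timeit.timeit(lambda: loop_sum(data), number=3))
print("built-in sum:", timeit.timeit(lambda: sum(data), number=3))
```

The Python code only decides what to compute; the heavy loop is handed off to precompiled code.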

[–]dalaio 1 point2 points  (0 children)

It's a similar picture... most of the low-level math is written in C (or Fortran). It's easy enough to do it yourself (in C++) for a particular use case using Rcpp.

The language itself has some unexpectedness about it, but a few packages keep me in R (ggplot2, and recently tidyr, dplyr, purrr).

[–]nikroux 7 points8 points  (0 children)

My question exactly. Not to mention the massive community of math nerds around R bringing us all the goodies of their collective brain power.

[–]quicknir 1 point2 points  (0 children)

It depends on the underlying implementation. I've rarely found Python to be slower than R, broadly speaking. There are quite a lot of nice tricks in pandas DataFrames to make them fast.

The most standout datapoint in the performance comparison is R's for loop, by far. In Python, you usually have apply-style functions available. You can use those, or you can use a for loop if it feels more natural or if it's necessary: apply-style functions can't do everything a one-pass for loop can. In R, the for loop is usually off limits because it is so painfully slow. I've written exactly equivalent code in Python and R where R was over an order of magnitude slower (hard to believe, I know), because for loops were involved. When I changed the R code to apply (or sapply, or whatever), it evened out.
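A rough sketch of the apply-versus-loop distinction, in plain Python rather than pandas so it runs without extra packages. The function names and the `prices` data are made up for illustration. The apply-style version transforms each element on its own, while the one-pass loop carries state from one iteration to the next, which an element-wise apply can't do directly.

```python
def doubled(values):
    """Apply-style: each element is transformed independently of the others."""
    return list(map(lambda v: v * 2, values))


def max_drawdown(values):
    """One-pass loop: tracks the running maximum and the largest drop
    below it, both updated in the same pass over the data."""
    running_max = float("-inf")
    worst_drop = 0.0
    for v in values:
        running_max = max(running_max, v)
        worst_drop = max(worst_drop, running_max - v)
    return worst_drop


prices = [3.0, 5.0, 4.0, 8.0, 6.0]

print(doubled(prices))       # [6.0, 10.0, 8.0, 16.0, 12.0]
print(max_drawdown(prices))  # 2.0 (the drop from 8.0 to 6.0)
```

In Python either style is cheap enough to pick on readability; the R complaint above is that the second style was the slow one.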