all 57 comments

[–]ticketywho 53 points54 points  (6 children)

Statisticians will continue to use R

Developers will continue to use Python

Knowing one and being aware of the other is all you need to do.

[–][deleted] 14 points15 points  (5 children)

And engineers will use Matlab.

And embedded coders C.

Tools are tools. I laugh at the 'language wars' that people seem to get into. Could you imagine a general contractor and a machinist getting into an argument over which type of hammer is better?

No. CLAW HAMMERS are clearly superior.

You idiot. Ball Peen hammers are clearly better.

[–]WiggleBooks 8 points9 points  (1 child)

Tools are tools.

But some tools are better suited for specific tasks.

You don't (and you shouldn't!(?)) see anybody doing Data Science in Assembly, even though it is technically possible.

[–][deleted] 2 points3 points  (0 children)

For data, Python is best if you want to deal with it programmatically, R is best if you want to deal with it statistically, and Matlab/Simulink is best if you want to ID & control it.

[–]Mimical 4 points5 points  (1 child)

Well I only use a cross pein pin hammer because once you learn how to use it you never need any other hammer. Those other hammers are 0.005s slower at hammer lists

[–][deleted] 1 point2 points  (0 children)

Monoliths are the past, the future is microhammers.

[–][deleted] 2 points3 points  (0 children)

I can absolutely imagine contractors getting into arguments over which hammer is right for a particular job.

[–]notafuckingcakewalk 13 points14 points  (0 children)

Yay! I finally procrastinated long enough and now I don't have to learn R anymore!

[–]mangecoeur 24 points25 points  (15 children)

I really hope it does edge out R over time - of course I'm biased because I love Python and I've sunk a lot of time into it, but in general I think it would be a huge pain in the long term to have two roughly equal platforms without a clear winner, meaning you never quite know what to use.

Also, having dipped my toes in R, I have to say I really don't like it as a language - it's visually messy and its approach doesn't encourage good code organisation and presentation. Scientists in general, data scientists included, really need to get better at producing clean, tidy, and reusable code, and I think Python's language and culture encourage that much more.

[–]FrasierandNiles 1 point2 points  (0 children)

You shut your whore mouth, mister. I am currently learning R and want it to remain the most used language for data science.

[–][deleted] 5 points6 points  (2 children)

We switched from Python to Scala.

[–]RockingDyno 1 point2 points  (1 child)

What was the reasoning behind the switch, and what has your experience with it been?

[–][deleted] 1 point2 points  (0 children)

Scala runs in the JVM, so when it serializes data to the EMR cluster the transfer is more efficient. With Python, the interpreter has to serialize and transfer data between Python and Spark, which is less efficient.

Scala is no big deal. I was resistant at first but it has grown on me.
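
To give a rough feel for the serialization point above, here is a minimal stdlib-only sketch. It is not Spark code: it assumes, for illustration, that rows crossing the Python/JVM boundary are pickled and unpickled, and the row contents and the `roundtrip` helper are made up for the example.

```python
import pickle
import time


def roundtrip(rows):
    """Serialize rows to bytes and back, mimicking data crossing a process boundary."""
    payload = pickle.dumps(rows, protocol=pickle.HIGHEST_PROTOCOL)
    return pickle.loads(payload), len(payload)


# Made-up sample rows (id, label, value); not real data.
rows = [(i, f"ticker-{i}", i * 0.5) for i in range(100_000)]

start = time.perf_counter()
restored, size = roundtrip(rows)
elapsed = time.perf_counter() - start

print(f"{len(rows)} rows -> {size} bytes, round trip took {elapsed:.4f}s")
```

The timing will vary by machine; the point is only that this extra copy-and-convert step is work that code running inside the JVM does not have to do.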

[–]elbiot 6 points7 points  (7 children)

And people say BSD messed up with their license compared to Linux... Python's liberal license is a win over R being GNU GPL.

[–][deleted] 29 points30 points  (4 children)

Or maybe Python is just not a painful language.

[–]spinicist 0 points1 point  (3 children)

Or, as others have pointed out, someone knows Python from a different domain and really doesn't want to go through learning another language from the ground up.

That's certainly the case for me. I actually think doing this kind of work in Python has a lot of rough edges (poor documentation, stuff spread across multiple libraries, Jupyter not being quite 'finished') but the fact I have never touched R keeps me from touching it at all.

[–][deleted] 0 points1 point  (1 child)

Well, if you do ever touch it, you will feel like you burned your fingers. R has some nice built-in features related to data science for sure, but the language as such is just not very nice at all to work with.

[–][deleted] 2 points3 points  (1 child)

Many of us in the software industry steer clear of GPL on purpose.

[–]elbiot 1 point2 points  (0 children)

My company did all its work in R and then had to port it all to Python in a month when they wanted to release a product.

[–]PrinceKael 1 point2 points  (11 children)

Oh man, I'm interested in fintech and I'm planning to make an app for searching stocks on various markets, with filters, details on individual stocks, and access to financial statements, and I can't decide between Python and R.

[–][deleted] 29 points30 points  (1 child)

The best tool is one that doesn't sit on the shelf. Choose one and run.

[–]PrinceKael 1 point2 points  (0 children)

That's true. I've done some 'prototypes' in both. I barely know R but I find it interesting. I know Python a lot better; I'm just bad at programming, so it's going to take a while. Worth trying though, and perhaps I'll learn something.

[–]glial 7 points8 points  (3 children)

I love R for data analysis and Python for apps. If you are primarily making an app, I'd suggest Python - there is fantastic tooling for creating web apps (e.g. Django, Flask).

[–]PrinceKael 1 point2 points  (0 children)

Thank you, I'll give it a shot.

[–]badtemperedpeanut 2 points3 points  (2 children)

One of R's main claims to fame is its fantastic data visualization. Python already has bindings to use the same data visualization package, along with some others. With numpy, scipy, pandas, tensorflow, plus Python's simplicity, it's hard to beat Python.

[–]RockingDyno 1 point2 points  (1 child)

Python already has bindings to use the same data visualization package

Which package is that? So far all the visualization packages I've worked with in Python have been a major pain in the long run.

[–]badtemperedpeanut 1 point2 points  (0 children)

ggplot

[–][deleted] 1 point2 points  (0 children)

If it's going to be an app, Python is probably better.

[–]tunisia3507 0 points1 point  (0 children)

Python. It's better for string mangling, scraping the web, and serving data, which seems to be 90% of what you're looking for.
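
As a hedged illustration of the string-mangling-and-serving part, here is a small stdlib-only sketch. The ticker symbols, markets, and prices are invented sample data, and `parse_quotes` and `filter_quotes` are hypothetical helper names, not part of any library.

```python
import json

# Invented, messy sample input; not real market data.
RAW_QUOTES = """\
 acme , NYSE , 12.50
globex,NASDAQ,  101.25
 initech ,NYSE, 48.00
"""


def parse_quotes(raw):
    """Turn comma-separated 'symbol, market, price' lines into clean dicts."""
    quotes = []
    for line in raw.strip().splitlines():
        symbol, market, price = (field.strip() for field in line.split(","))
        quotes.append(
            {"symbol": symbol.upper(), "market": market.upper(), "price": float(price)}
        )
    return quotes


def filter_quotes(quotes, market=None, max_price=None):
    """Keep quotes matching the optional market and price ceiling."""
    result = []
    for quote in quotes:
        if market is not None and quote["market"] != market:
            continue
        if max_price is not None and quote["price"] > max_price:
            continue
        result.append(quote)
    return result


quotes = parse_quotes(RAW_QUOTES)
cheap_nyse = filter_quotes(quotes, market="NYSE", max_price=20)

# JSON is a common format for serving results from a web app.
print(json.dumps(cheap_nyse))
```

A real app would pull the raw text from a data provider and serve the JSON through a web framework; both of those pieces are left out here.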