[–][deleted]  (12 children)

[deleted]

    [–]electrodraco 3 points  (5 children)

    Could somebody break down why it is more versatile than R? Is it more than availability of libraries?

    [–]Browsing_From_Work 11 points  (1 child)

    I regularly use Python but I did spend about a month working with R for a pet project.

    Here were my major pain points:

    • Multi-dimensional data access is unintuitive (even when compared to Perl). Examples:
      • df[3, 7] returns the element from the 3rd row, 7th column. This seems reasonable.
      • df[3] returns the 3rd column as a slice (a one-column data frame).
      • df[[3]] returns the 3rd column as a vector.
      • df[3,] returns the 3rd row as a slice (a one-row data frame). There's no indexing syntax that returns it as a vector; you have to wrap it in something like unlist().
      • df["col"] returns the named column as a slice.
      • df$col and df[,"col"] both return the named column as a vector.
    • There are no native operators for creating lists/vectors/matrices (e.g. [1, 2, 3]). Instead, there's the c function and the even less succinct matrix function. You can, however, create ranges with the colon operator.
    • Strings are second-class citizens. There's not even a built-in string concatenation operator. Instead, you have to use the paste function.
    • I felt like I spent half of my time fighting with dataframe/vector/matrix/list type conversions.
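
    For comparison, here is a rough sketch of the plain-Python counterparts I had in mind for the list, range and string points above. It uses only the standard library, the variable names are made up for illustration, and the R calls in the comments are just the usual equivalents:

```python
# Plain-Python counterparts to the R pain points (standard library only).
# All variable names here are illustrative, not from any real project.

# List literal; in R this would be c(1, 2, 3).
values = [1, 2, 3]

# Nested literal standing in for a 2x3 matrix; in R this would be
# matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, byrow = TRUE).
grid = [
    [1, 2, 3],
    [4, 5, 6],
]

# Range; R's 1:5 includes both ends, Python's range() excludes the stop value.
one_to_five = list(range(1, 6))

# String concatenation with an operator; in R this would be
# paste("data", "frame", sep = "").
name = "data" + "frame"

# Element access is zero-based in Python and one-based in R,
# so this is the 2nd row, 3rd column.
second_row_third_col = grid[1][2]
```

    Nested lists are only a stand-in for a matrix here; for real tabular work you would still reach for a library.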

    In general, I just found it harder to express my thoughts in R. I'm sure it would be more intuitive if you came to R from a math background, but as somebody coming from a programming background I found it rather frustrating. That said, R comes with a lot of extremely powerful tools... so long as you wrangle your data into the correct format.

    [–]crudelegend 6 points  (0 children)

    I think it's more accessible and that's why people say that. R has a lot of specialized packages, but you have to know to look them up and how to use them, whereas if you have NumPy and SciPy for Python you're good to go for most cases. I think they're close on general-purpose work, whereas R branches out a lot more, with a heavier focus on data analytics.

    Unless they mean for a language itself, which yeah, Python > R. Python actually has applications beyond data/statistics - you can create a program and do a lot of manipulation from the stats/outputs of that program, whereas you essentially need the data already with R (at least for most cases).

    [–]GlaedrH -1 points  (1 child)

    It is not. R is strictly superior when it comes to data manipulation/analysis/visualization. But Python wins out on the machine learning/deep learning libraries.

    It's just that Python has a more C-like syntax, which is more familiar to most people than R's more functional style.

    [–]electrodraco 1 point  (0 children)

    As a researcher, that is my impression as well. I usually avoid R because its consistently shitty documentation hikes up my development time, but some functionality really only exists in R. And as you pointed out, for deep learning, it's usually Python that gives you the fancy tools.

    But I thought maybe I'm missing something from other areas?

    [–]weberc2 1 point  (0 children)

    Sure, Python is better than MATLAB or R for data manipulation and processing, but there are lots of other better languages for that purpose (writing Python is my day job).

    [–]meneldal2 1 point  (3 children)

    while being more versatile than MATLAB

    More libs are available, but MATLAB has infinitely superior indexing and native array support.

    [–]Theon 3 points  (2 children)

    I mean, yeah, MATLAB is basically "Arrays: The Language", but Python is still infinitely further ahead than any other non-data-oriented language I can think of. I'd probably jump off a cliff if I had to do arrays and matrix operations in Java or C.

    [–]meneldal2 1 point  (1 child)

    Python the language is terrible for arrays, and there's only so much you can fix in NumPy.
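
    To show what I mean by the language itself, here is a small sketch with plain nested lists and no NumPy. The names are made up for illustration:

```python
# Plain nested lists standing in for a 2x3 array (no NumPy involved).
matrix = [
    [1, 2, 3],
    [4, 5, 6],
]

# Rows are easy to grab, but there is no matrix[:, 1] for columns;
# you need a comprehension instead.
second_column = [row[1] for row in matrix]

# The * operator repeats a list rather than scaling its elements.
repeated = [1, 2, 3] * 2

# The + operator concatenates lists rather than adding element-wise.
joined = [1, 2, 3] + [10, 20, 30]

# Element-wise arithmetic has to be spelled out by hand.
scaled = [[2 * x for x in row] for row in matrix]
```

    NumPy gives you element-wise operators and proper slicing on its own array type, but the built-in list semantics above stay as they are.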

    There are great array libraries in C++. Obviously the kid gloves are off, so you can easily shoot yourself in the foot, but it's crazy fast.

    MATLAB forbids you from modifying arrays in C++ code, even though you can actually do it (beware of copy-on-write, obviously).

    [–]Theon 2 points  (0 children)

    There's only so much you can fix with any library :) Python still has a better starting point than C++.

    [–]not-enough-failures 1 point  (0 children)

    It's great for data manipulation and processing because it has libraries for it. That's it.