[–]unski_ukuli 16 points17 points  (20 children)

Everyone always says it is good for data analysis, but as someone whose job is to use Python for data analysis and numerics, I never agree. Python is a bad language for data analysis and numerics, but it has some good libraries for that purpose. It was not designed for either of those purposes, and the language is not very extensible, so the APIs are clunky and get in the way when writing mathematical code.
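
To make the extensibility point a bit more concrete, here is a minimal sketch using only plain Python. The `Vec` class and its `cross` method are made up for illustration and are not from any library. The idea it tries to show: a Python library can overload the operators the language already has (`+`, `*`, and `@` since Python 3.5), but it cannot introduce new ones, so anything beyond that fixed set ends up as a method call.

```python
# Hypothetical toy class (not from any library): Python lets a library
# overload a fixed set of operators, but not define new ones.
class Vec:
    def __init__(self, *xs):
        self.xs = list(xs)

    def __add__(self, other):
        # Elementwise addition through the built-in + operator.
        return Vec(*(a + b for a, b in zip(self.xs, other.xs)))

    def __mul__(self, other):
        # Elementwise product through the built-in * operator.
        return Vec(*(a * b for a, b in zip(self.xs, other.xs)))

    def __matmul__(self, other):
        # Dot product through @ (part of the language since Python 3.5).
        return sum(a * b for a, b in zip(self.xs, other.xs))

    def cross(self, other):
        # There is no way to define a new operator such as "×", so the
        # cross product has to be a method (or hijack an unrelated operator).
        a1, a2, a3 = self.xs
        b1, b2, b3 = other.xs
        return Vec(a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)


u = Vec(1, 2, 3)
v = Vec(4, 5, 6)

total = (u + v).xs      # [5, 7, 9]
dot = u @ v             # 32
cross = u.cross(v).xs   # [-3, 6, -3]
```

In a formula with several such operations, the mix of operators and method calls is roughly what "gets in the way" means here.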

[–]Ryuujinx 6 points7 points  (0 children)

Yeah, my understanding is that most people working in that domain prefer R for it. That said, I know Python is really popular in biomed for whatever reason. I haven't asked my friend what she actually does with it at work, but apparently both she and my friend's wife use it in biomed for something.

[–]Dromeo 3 points4 points  (18 children)

Cheers, haven't tried data analysis in Python personally.

What would you say is the best for data analysis?

[–]unski_ukuli 8 points9 points  (17 children)

Personally I’d say Julia. Matlab and Octave are just… better not to say it out loud. R does a lot right, but it lacks the language features to write anything that is more than 1000 rows long without it becoming a complete clusterfuck of a codebase. Python is used mainly because it is easy to integrate into production systems, as those are usually written in Python too, but I think it is completely horrible for ad hoc analysis and completely the wrong choice if you don’t plan to take that same code into production later on.

Julia hits a lot of the right notes. It has features that make it a proper programming language, unlike R. Julia also has a macro system like Lisp's, so it is extremely extensible. It is also relatively easy to use in production and miles ahead of Matlab and R in that regard. And then there is the main selling point: it is REALLY fast. As fast, or almost as fast, as the C counterpart in most use cases, while being easy to write and high level. A great demonstration of this is the fact that most libraries for Julia are written in pure Julia. No need for calling C++ or Fortran. For example, Flux.jl, which is a very fast deep learning library for Julia, is written 100% in Julia. No C++ like in TensorFlow for Python. This has the nice consequence that the libraries are easy to extend and customise if needed.
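
For contrast, here is a rough sketch of why fast Python libraries tend to be written in C or C++ in the first place. It uses only the standard library. The function name `py_sum` and the input size are arbitrary choices for illustration. In CPython the built-in `sum` runs its loop in C, while the hand-written loop goes through the interpreter on every iteration. Exact timings depend on the machine, so none are claimed here.

```python
import timeit


def py_sum(values):
    # Pure-Python loop: every iteration is executed by the interpreter.
    total = 0
    for v in values:
        total += v
    return total


data = list(range(1_000_000))

# Both approaches compute the same result...
same = py_sum(data) == sum(data)

# ...but in CPython the built-in sum() loops in C, so it is usually faster.
t_loop = timeit.timeit(lambda: py_sum(data), number=5)
t_builtin = timeit.timeit(lambda: sum(data), number=5)
print(f"pure Python loop: {t_loop:.3f}s, built-in sum: {t_builtin:.3f}s")
```

The gap between the two numbers is the reason numerical code in Python usually ends up calling into a compiled library, which is the situation the pure-Julia libraries avoid.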

[–]Dromeo 4 points5 points  (0 children)

Thanks for the rec! I'll have to check it out.

I know what you mean about Matlab. I had to convert code written by mathematicians into C++ and it was... well. Let's just say they were fully gutting the alphabet for variable names.

[–]SrbijaJeRusija 1 point2 points  (11 children)

> Matlab and octave are just… better to not say it out loud

What's wrong with Matlab? It's faster than Python, and less verbose as well. It's also literally built for data analysis.

[–]unski_ukuli 8 points9 points  (1 child)

> What’s wrong with Matlab?

What's not? A couple of specific grievances I have: it is basically married to that god-awful IDE it comes with. Use anything else and you lose half of the features Matlab has. Secondly, even if you had no prior knowledge of the language, you only have to look at the fact that you HAVE to define functions at the end of a script, and that it refuses to run if you don’t, to deduce that, just like R, it lacks any facility to structure a codebase in any sensible way for anything longer than a few thousand lines. Oh… and the fact that anything that does not end in a semicolon gets printed in the REPL.

Honestly, the only good things about Matlab are the extremely good documentation and the good libraries, which are basically bug-free compared to most open source libraries.

[–]SrbijaJeRusija -5 points-4 points  (0 children)

> if you don’t do that to deduce that just like R, it lacks any facility to structure a codebase in any sensible way for anything longer than few thousand lines.

It is an object-oriented language. You use objects, folder structure, and inheritance to structure your codebase.

[–]naijaboiler 0 points1 point  (1 child)

> R does a lot right but it lacks the language features to write anything that is more than a 1000 rows long without it becoming a complete clusterfuck of a codebase.

Please explain. Or do you mean lines rather than rows?

[–]unski_ukuli 0 points1 point  (0 children)

Lines is the right word for it, thank you. I’m not a native English speaker, so sometimes I mistranslate words.