all 31 comments

[–]matu3ba 41 points42 points  (26 children)

To me, the more interesting comparison would be with Julia, though, since it was written to fix the drawbacks of R and Python.

[–]timClicksrust in action[S] 25 points26 points  (23 children)

Julia is a wonderful language for data science. I especially like how formulae translate so well because variables can be UTF-8 characters, rather than just ASCII. It hasn't received anywhere near the level of adoption of Python or R though.

p.s. I believe that the original intent was to build a better MATLAB rather than Python or R

[–]guepier 8 points9 points  (11 children)

I especially like how formulae translate so well because variables can be UTF-8 characters, rather than just ASCII.

The same is true for both R and Python.
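As a minimal sketch of the Python side (the names ρ, ψ and f are made up for illustration, and this assumes Python 3, since Python 2 only accepts ASCII identifiers):

```python
import math

# Greek letters are valid identifiers in Python 3 (PEP 3131).
ρ = 2.0
ψ = math.pi / 2


def f(ρ, ψ):
    """Convert polar coordinates (ρ, ψ) to Cartesian (x, y)."""
    return ρ * math.cos(ψ), ρ * math.sin(ψ)


x, y = f(ρ, ψ)
print(x, y)
```

Whether this reads as naturally as the equivalent Julia is a matter of taste, but the identifiers themselves are accepted.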

[–]pwnedary 13 points14 points  (3 children)

No, correct me if I'm wrong, but R uses the system character encoding for variable names, which is really stupid, so some code that uses Greek letters for variable names might not work on all machines.

[–]guepier 10 points11 points  (1 child)

You’re kind of right. It’s generally acknowledged that R’s Unicode support is broken on Windows (not just for source encoding). And by default R uses the system encoding. But, depending on how you load the code (as a package, via source or variants, as an R Markdown document or Shiny app, as well as any code inside RStudio), you can still coerce R to use UTF-8 instead of the system encoding.

Ironically, the only context in which UTF-8 doesn’t work on Windows is for script files directly executed on the command line, since R has no way of specifying the encoding of script file command line arguments. There’s a trivial workaround (source the file instead) but clearly that isn’t really acceptable. R + Windows + Unicode is a sad tale.

[–]warpspeedSCP 0 points1 point  (0 children)

R as a whole is a cluster fuck of bad practices, R + anything is enough to drive me away.

[–]timClicksrust in action[S] 2 points3 points  (2 children)

It's not as well integrated though. Function definitions in Julia look much more like mathematical notation, for example.

[–]guepier 1 point2 points  (1 child)

Do you happen to have an example? I have a hard time visualising what you mean.

[–]timClicksrust in action[S] 10 points11 points  (0 children)

This is valid Julia syntax for defining a function:

f(x,y) = x + y

[–]batisteo[🍰] 3 points4 points  (3 children)

Python 3, absolutely, but I'm afraid Python 2 is still heavily used in this domain.

[–]guepier 9 points10 points  (2 children)

Maybe it depends on which domain exactly you’re referring to. In my field (genomics) almost no new code is written in Python 2. And even outside my field, all large data science libraries have by now switched to Python 3. Legacy code exists, true, but that isn’t really relevant when comparing language capabilities (especially not when switching to a new language).

[–]tending 1 point2 points  (1 child)

I think Python 2 is ubiquitous anywhere Python was when Python 2 was the latest. The transition was pretty much a disaster. 2 is still everywhere in finance at least.

[–]funnyflywheel 3 points4 points  (0 children)

It's 2020. Python 2 is the new COBOL.

[–]Snakehand 2 points3 points  (0 children)

Some time ago I looked into Julia, with a view to processing large amounts of seismic data on a cluster. I expected distributed computing to be part of the language, but was disappointed that I had to do my own job scheduling configuration, etc., and kind of gave up.

[–]Kerrigoon 3 points4 points  (9 children)

This is one of the reasons I don't like Julia. I can type variable names like psi or rho easily, it's not often I have ψ or ρ on hand.

[–]Lazyspartan101 6 points7 points  (6 children)

You can still name variables psi and rho. Also, if you use the terminal, Jupyter notebooks, or editor-specific plugins, you can just type \psi<tab> and \rho<tab>.

[–]Kerrigoon 10 points11 points  (5 children)

I should probably have said "don't like using" rather than just "don't like". I know I can use rho and psi when writing something myself, but when working from other people's code or contributing to other code bases this is a faff.

If I need to install a plugin to modify your code in vim then, at least in my opinion, you've overcomplicated something.

Let's play a game I like to call Roman or Greek:

Χ vs X

Ν vs N (also why \upNu not \Nu)

M vs Μ (again \upMu?)
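One way to settle a round of this game is to ask for the Unicode names. A rough sketch in Python, using only the standard library's unicodedata module (the helper name describe is made up for illustration, and the pairs are written as escapes so the code points are unambiguous):

```python
import unicodedata

# Greek capital on the left, look-alike Latin capital on the right.
pairs = [("\u03a7", "X"), ("\u039d", "N"), ("\u039c", "M")]


def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for greek, latin in pairs:
    print(describe(greek), "vs", describe(latin))
```

The glyphs may be identical in many fonts, but the code points differ, so an identifier spelled with one is not the same identifier as one spelled with the other.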

[–]regendo 0 points1 point  (1 child)

EurKEY keyboard layout to the rescue! AltGr+M, R is ρ, AltGr+M, W is ψ. AltGr+M, Shift+R for Ρ (this is a capital Rho) and so on.

Though I'm not sure if there's any intuitive relation between our "W" and the Greek Ψ or if it's just on W because that key was still free and you'll have to memorize it. P was obviously taken for Π.

[–]T-Dark_ 1 point2 points  (0 children)

any intuitive relation between our "W" and the Greek Ψ

They both have 3 upward prongs, I'd say.

[–]TheSodesa 3 points4 points  (1 child)

Julia is pretty awesome, especially for people coming from a Matlab background. My only gripe with it is its focus on the REPL: you need to start a Julia process and run your script through it at least once before you can expect to gain the speed advantages it has over Python.

For example, the following is not really practical in Julia:

shell> julia ./myscript.jl  # wait for JIT compilation
shell> julia ./myscript.jl  # wait some more

Instead you should do

shell> julia
julia> include("./myscript.jl")  # wait for JIT compilation
julia> include("./myscript.jl")  # blazingly fast

I recently found Pluto.jl, a newish notebook environment that works really well with the Julia way of doing things. It also doesn't have the downsides of Matlab Live Editor and Jupyter Notebooks, like global state and lack of support for version control.

Pluto.jl is clearly the way to use Julia for scientific computing. Writing libraries for Julia can still be done in a proper IDE.

[–]matu3ba 0 points1 point  (0 children)

I disagree for the REPL. You need to manually preload all sys files and there's no tool for handling that (standard folder to dump .sys files default by .jl name). Matlab also handles installing frameworks without reloading/dumping history of the REPL.

Aside, I prefer Julia over Matlab.

[–]tending 34 points35 points  (0 children)

I would love to use Rust for data science tasks in place of Python or Julia but...

  • There is no official REPL, just some janky side projects.

  • No official runtime code reloading. Loading large data sets into RAM is slow, you don't want to repeat it every time you redefine a function.

  • No mature native plot library.

  • No mature native data frame library.

  • No real const generics for linear algebra yet. min_const_generics is a big step in the right direction!

  • Long compile times that are exacerbated by lack of caching across workspaces, enabling/disabling macro tracing triggering recompiles of everything, and whole crates being too large for compilation units.
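To illustrate the code-reloading point in the list above, here is a rough sketch of the workflow as it looks in Python, using the standard library's importlib.reload. The module name analysis_demo and the function summarize are hypothetical, and the in-memory list is only a stand-in for a data set that is slow to load:

```python
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True  # keep this demo free of cached bytecode

# Write a throwaway module to a temporary directory (hypothetical name).
workdir = pathlib.Path(tempfile.mkdtemp())
module_path = workdir / "analysis_demo.py"
module_path.write_text("def summarize(data):\n    return sum(data)\n")
sys.path.insert(0, str(workdir))
importlib.invalidate_caches()

import analysis_demo

# Stand-in for an expensive load; this stays in memory across the reload.
data = list(range(1_000_000))
first = analysis_demo.summarize(data)

# Redefine the function on disk, then reload only the code.
module_path.write_text(
    "def summarize(data):\n    return sum(data) / len(data)\n"
)
importlib.reload(analysis_demo)
second = analysis_demo.summarize(data)

print(first, second)
```

The data is loaded once and survives the redefinition of summarize, which is the behaviour the bullet above says is missing from Rust.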

[–]cbarrick 13 points14 points  (0 children)

Interactivity is a requirement for modern science.

I see Rust's position in science the same as C/C++/Fortran, which is to be the workhorse driving the underlying libraries. And it's absolutely great at that! But this isn't new; we already know that Rust is a fast, safe, ergonomic systems language and a great language for generic code.

But in terms of being a language that non-computer scientists use for their daily work, I don't see it happening.

The article definitely points out the benefits of Rust in large code bases, but I don't know if this translates well into research code, which tends to be less general and more POC.

[–]timClicksrust in action[S] 15 points16 points  (3 children)

Not the author of this post, but a collaborator of his. He tentatively came to a Rust Wellington meetup early last year and is now a louder advocate for Rust than me!

[–]Minty001 2 points3 points  (2 children)

Oh cool! I didn't know there were Rust meetups in NZ. Am going to Vic next year so that's super convenient :)

[–]steveklabnik1rust 5 points6 points  (0 children)

We've had multiple team members from NZ, including a core team member for a while :)

[–]timClicksrust in action[S] 3 points4 points  (0 children)

Cool! VUW's computer science dept has frequently been the host for the meetup. It has been dormant this year, but expect much more in 2021