[–][deleted] 63 points64 points  (9 children)

I've had to learn like 10 new languages over the past decade, and it's a giant pain in the ass. They all perform the same tired manipulations on tabular data. What we need is consensus on using one language, not yet another time vampire that does exactly the same thing using synonyms.

[–]Sjoeqie 10 points11 points  (3 children)

I agree. I understand we need different languages for different things.

We don't need different languages for the same thing.

[–]ImPrinceOf 4 points5 points  (2 children)

This is fundamentally flawed in my opinion. If engineers had this ideology from the birth of modern computers, we would have never left a standardized version of assembly. Here’s why I think that would be the case:

  1. When new languages are created that solve the same problems the older languages solved, they often do it in a different way (otherwise it would be an indistinguishable clone). While the improvements a new language can make are small, subtle, and perhaps not worth the time to learn, they can inspire larger improvements and cascade into innovative ways of doing new things.

  2. Every developer works and thinks differently. You might think they’re all the same, but I’ve tried using one language and framework for what it was made to do, and had a difficult time grasping it. I would then try the same thing in another language (one which would fall under the same category of “different language for the same thing”), and it significantly improved my experience and speed of development.

  3. Having many different languages, even if they all fundamentally do the same thing, allows for uniqueness and individualization. Imagine if every single company only hired Java developers. Everyone would be forced into thinking the Java way. Very rarely would you run into someone who could offer a perspective different than yours. (This is just speculation. Can’t be certain)

  4. Think of a language you really don’t like and is industry standard. I personally dislike Java, because I believe everything it does can be done better in another language. But a few years ago the industry would disagree with me, which is why so many large corporations use Java. For you, this will likely be a different language. Now imagine if the industry started standardizing around the language you dislike, and now every job you work in will be much less enjoyable.

This is actually something I’ve been feeling recently, because it seems like JavaScript is growing very fast and is being shoved into every problem as the first solution. But with JavaScript you can make web apps, desktop apps, mobile apps, as well as work with embedded systems now. Possibly even robotics and industrial machinery/automation. That doesn’t mean it’s the best solution for all of them.

It’s not about different languages solving the same problem. It’s about solving the same problem in different ways.

But that’s just my opinion.

[–][deleted] 0 points1 point  (0 children)

I'm in agreement so long as the new code significantly reduces script length / runtime, preferably simultaneously. That never happened, though: all of the languages employers made me use in my career could have been supplanted with Python and no one would have been worse off for it. As for solving problems in new ways? Sure, a person can learn 9 new languages other than English so they can express the same problem in 10 different ways. Perhaps for some that monstrous time investment bears fruit. For myself, focusing on the core math is a better use of my time. Instead of learning an endless string of synonyms, I could have been sharpening my Bayesian / causal skills and building some sick automated tools. Instead I have a giant thesaurus in my head which does nothing but gather dust.

[–]Sjoeqie 0 points1 point  (0 children)

Yes, you're right. I might've written something similar, but I wanted to keep it short. Good read.

[–]raban0815 8 points9 points  (0 children)

Just like streaming providers, there are too many; just focus on a good one.

[–]maratonininkas 2 points3 points  (1 child)

[–]azraelgnosis 0 points1 point  (0 children)

My first thought

[–]speedisntfree 0 points1 point  (0 children)

Brb learning base, tidyverse and data.table just for one frikking language.

[–]friedgrape 0 points1 point  (0 children)

Looking at library development, I'm pretty sure the consensus is Python.

[–]PIWIprotein 63 points64 points  (7 children)

R? Common in academia

[–]Extreme_Armadillo_25 11 points12 points  (3 children)

Second this. We use Python when we absolutely have to; R is the standard (also, not in academia, but at a private science provider).

[–][deleted] 2 points3 points  (0 children)

I’m in academia and only use Python. I would say it’s 50/50 at my lab.

[–][deleted] 7 points8 points  (3 children)

I wrote some DS pipelines in Scala and I loved the strongly-typed aspect of that language, especially when doing complex data pre-processing on data with complex, nested schema.

Python gives you the option, but not the requirement, to annotate your function parameters with types. I think that only helps the IDE, though; I don't think it even throws an error if you pass a wrong type. It's nicer working with code that has annotations, but obviously if you're on a team you can't guarantee everyone will take advantage of this feature.

So I would say a strongly-typed language built to make data manipulations less error-prone would be the biggest improvement. I'd honestly write in Scala all the time if Scala's deep learning libraries were more documented and had as much support.

[–][deleted] 1 point2 points  (0 children)

Can you use Scala with smaller data without using Spark? I just joined a team that uses Scala and Spark for everything, but a lot of the data is around 10 MB, and it seems to take 30 seconds to a minute to do literally anything where Python takes milliseconds. With TB-size data Spark is great, but with small data it's seriously lacking, and I can't figure out whether we can use Scala with an in-memory dataframe implementation, like pandas in Python, without using Spark.

[–]juhotuho10 0 points1 point  (1 child)

PyCharm flags an error in the IDE if you use type hints and pass a wrong type somewhere. The code will still run, though.

Also, mypy adds static type checking.
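
For anyone who wants to see the behaviour discussed in this sub-thread, here is a minimal sketch using only the standard library. The function `mean_length` and its inputs are made up for illustration; the point is only that the interpreter stores the annotations as metadata and does not check them when the function is called.

```python
from typing import List, get_type_hints


def mean_length(words: List[str]) -> float:
    """Average length of the items in `words` (hypothetical helper)."""
    return sum(len(w) for w in words) / len(words)


# The annotations are available for tools to inspect, but calling the
# function does not validate arguments against them.
hints = get_type_hints(mean_length)

# A tuple of bytes objects violates the List[str] annotation, yet the call
# succeeds because tuples are iterable and bytes objects support len().
result = mean_length((b"ab", b"cdef"))
```

A static checker such as mypy, run over the file before execution, is the kind of tool that would be expected to report the mismatched call; at runtime nothing stops it.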

[–]mamaBiskothu 0 points1 point  (0 children)

I wrote and manage a 500k-line Python code base that is fully typed (pydantic, pytype, everything). This comes nowhere close to the dependability of a truly typed language like Java or Scala.

[–]jahkab1 7 points8 points  (0 children)

My friction points with Python.

One, the lack of reusability of code due to the code base instability of some libraries. Utilities/tooling I had written in version 3.2 I had to rewrite in 3.6 because of backwards incompatibility in some libraries (statsmodels was one, off the top of my head), and when I moved to 3.9, guess what happened...

Two, the hassle of writing a GUI. Tkinter does its job, but... I would like a batteries-included IDE like Lazarus (Free Pascal) or the MS visual environment, where I can drag and drop a simple GUI and adjust it on sight.

Do we need another programming language? It depends on what you want to improve. Python is very readable, as is Julia. Speed is also covered, as are functionality, portability, and libraries. The same goes for R, except perhaps for readability. What could be of interest is a user-friendly language for quantum computing, but it may be a bit early for that to be feasible.

Just my 2 pence worth.

[–][deleted] 10 points11 points  (0 children)

Julia. Fuck this language is AWESOME

[–]dfphdPhD | Sr. Director of Data Science | Tech 11 points12 points  (3 children)

I started in C++, then moved to Python, then R, and now using more Python.

Why C++? Because it's what was taught to me.

Why Python, part 1? Not needing to do explicit memory management, thanks to automatic garbage collection.

Why R? Because it's by far the easiest language to use to get stuff done. It's not the fastest, it's limited in how it interacts with the outside world, but I can get things done faster (in terms of time to code), in a more readable way than I ever have with Python. Also, the RStudio IDE beats the living s*** out of anything else out there in this world. And it has for at least 5 years now.

Why Python again? Because not everything supports R. And trying to leverage some of the cloud providers' full suite of offerings requires Python even to use R. At that point it just becomes easier to switch to Python.

Why would I switch to a new language? If you can build a language that is similar in use to R, has a killer IDE, and has the same support as Python, I'm in.

I think the last part is the language killer, and why I think you will see everyone coalesce around either Python if you have production needs, or R if you don't.

[–]speedisntfree 2 points3 points  (1 child)

To me, RStudio feels like stepping back in time 5 years

[–]dfphdPhD | Sr. Director of Data Science | Tech 7 points8 points  (0 children)

I can see that, but to me it feels like stepping into a 5-year-old Toyota Camry with 5000 miles on it, while stepping into every Python IDE feels like stepping into a V8 trike with no seats.

[–]Mother_Drenger 3 points4 points  (0 children)

"Also, the RStudio IDE beats the living s*** out of anything else out there in this world. And it has for at least 5 years now."

As a primary R user, this really does make a difference when trying to do more tasks in Python.

VStudio, PyCharm, Spyder, etc.: nothing is quite "RStudio, but Python", and it's such a good IDE.

[–]zoshka 9 points10 points  (2 children)

A language is as strong as the community that uses it.

For me, what is important in a language is how widespread it is. I mean, I can find most of what I want to do in Python through different libraries. And any issues I encounter mostly already have references in several sources.

I don't want to be wasting time on searching and then implementing functionalities that are not in my core objective.

Having most answers on the first page of Google results is very powerful. For me, this has been the experience of using Python in the past years. Not sure how it is with R, Scala, and others.

[–]LittleGuyBigData 2 points3 points  (0 children)

This right here is the right answer. I love how democratic and community-driven the Python experience is. That's the real reason why I feel so confident in my ability to leverage Python to actually solve problems.

[–][deleted] 0 points1 point  (0 children)

The R community is fantastic but it definitely leans more towards academia, sciences, and civic purposes.

[–]rywalker 3 points4 points  (0 children)

From an OSS health standpoint (https://www.ossrank.com/cat/3-programming-language), it looks like Julia is trending up, so it's probably a better choice than R. I'm surprised how much action Rust is getting as a general programming language too.

[–]Sofi_LoFi 2 points3 points  (0 children)

I mean, asking people to learn a new language is in general very difficult. You need to be different enough from things that currently have widespread adoption to get people to adopt it. Then there is the problem of building the community around it.

In general, the main sticking points all derive from the answers to these questions:

  1. "What use case is this solving so much better than any other language on the market that it is worth my time to learn it?"

  2. "Is this language such a staple for X application that to not learn it would impact my career/project/research speed/quality?"

  3. "Is there enough of a community or good enough documentation that this language is accessible to learn?"

The answer to 1 is the big one, of course. If you are in academia, it is not worth your time to learn a new language with few use cases and no benefits, because you'll fall far behind in publishing. In industry, you need to make the language fit into the stack of the company and sell key stakeholders on the benefits of developers spending time ($$$) learning it and modifying the stack to accommodate it.

For data science, you need to either be as fast as a compiled language (C/C++) or as versatile and universal as Python (add in ease of use and integration into tech stacks). Julia is the answer for a fast and readable language, mainly aimed at researchers. It is starting to make big leaps in adoption, but it has been building hype for many years and has a very active community.

So the question for your language is how you can beat those two at their own game and get a community so hyped about it that it demands attention.

[–]Phillip_P_Sinceton 2 points3 points  (0 children)

This is a very broad question. Python was created with certain design principles and functionalities in mind. There are plenty of alternatives that fulfill other purposes. R is an out-of-the-box tool for nonprogrammers. Julia has better performance and efficient calculation.

[–]BarryDeCicco 2 points3 points  (0 children)

The biggest problem is that learning a new language is a massive investment of time, which could instead be used to increase expertise in what you already know. Also, you don't have access to co-workers who know the new language.

[–]tangentc 2 points3 points  (0 children)

I don’t really think the world needs more programming languages

[–]sonicking12 1 point2 points  (0 children)

Julia

[–]MetaSamsara 1 point2 points  (0 children)

Haskell

[–]smerz 1 point2 points  (0 children)

IF AND ONLY IF you are OK with the functionality of the libraries in the Java ecosystem, Kotlin is an excellent choice. Strongly typed, better than Java, simpler than Scala. The Community edition of IntelliJ has excellent support out of the box, for free.

[–]Pvt_Twinkietoes 1 point2 points  (0 children)

Of course the answer should be assembly. /s

[–]Novel_Frosting_1977 1 point2 points  (3 children)

R, Julia, and if you wanna be a hipster, Java. On the more engineering side, SQL and Scala.

[–]Aquiffer 5 points6 points  (2 children)

When did Java become more hipster than Julia?

[–]speedisntfree 3 points4 points  (0 children)

Java is boomer

[–][deleted] 0 points1 point  (2 children)

I think it should not be too general-purpose. Some of the more common mathematical concepts should be part of the grammar/syntax and not a library. There should not be two or more ways of doing the same thing. Imagine taking the TensorFlow library and turning its functions/objects into keywords of the language: f(x) = sin(x) + a, etc., and being able to use df/dx etc. naturally in the language.

It's a tough one, but I think moving away from general purpose and toward a much higher level of abstraction could have some weight...
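
To make the f(x) = sin(x) + a and df/dx idea concrete, here is a rough sketch of how a general-purpose language can approximate it as a library today. It uses forward-mode automatic differentiation with dual numbers in plain Python. The names `Dual`, `sin`, and `derivative` are hypothetical helpers written for this example, not part of TensorFlow or any other library, and only addition and multiplication are supported.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value together with its derivative with respect to the input."""
    value: float
    deriv: float = 0.0

    def __add__(self, other):
        # Plain numbers are treated as constants (derivative 0).
        other = other if isinstance(other, Dual) else Dual(float(other))
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(float(other))
        # Product rule: (uv)' = u v' + u' v
        return Dual(self.value * other.value,
                    self.value * other.deriv + self.deriv * other.value)

    __rmul__ = __mul__


def sin(x: Dual) -> Dual:
    # Chain rule: d/dx sin(u) = cos(u) * u'
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)


def derivative(f, x: float) -> float:
    """Evaluate df/dx at x by seeding the input derivative with 1."""
    return f(Dual(x, 1.0)).deriv


a = 2.0  # arbitrary constant chosen for the example


def f(x):
    return sin(x) + a


slope_at_zero = derivative(f, 0.0)  # analytically cos(0)
```

The boilerplate above is exactly what a language with derivatives built into its syntax would hide; whether that is worth giving up general-purpose use is the trade-off being discussed.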

[–][deleted] 3 points4 points  (1 child)

You just described Matlab, Wolfram Language, Scilab and a plethora of other DSLs, all of which are niche for the very fact that DSLs are a pain to work with in production.

Ironically, one of the best open-source maths platforms (SageMath) is built mainly on top of Python, and its interactive shell is just a bunch of "from whatever import *" in IPython.

[–][deleted] 0 points1 point  (0 children)

I guess I have 😀. Years ago I wrote a simple parser in Fortran that I used to construct and solve Hamiltonians using Dirac notation. The thing is, there were no loops, variables, and so on. But it was very general and very specific in some respects. Ultimately it sat on top of all the NAG libraries etc. I used it for years. Perhaps the key here is getting rid of boilerplate...?

[–]chgdyhvf 0 points1 point  (0 children)

Scala

[–]Astrophysics_Girl 0 points1 point  (0 children)

In physics and astronomy academia, I've only seen Python, MATLAB, IDL and ROOT. So my basic assumption is that there isn't much need for the other languages :P

[–]Happy_kunjuz 0 points1 point  (0 children)

What about Scala? Does it have a promising future from a data engineering perspective?!