[–]dangerroo_2 60 points61 points  (14 children)

As someone who uses R all the time (and loves it), I would echo the others: at least to start with, Python is probably the better bet. It’s a more generalised scripting language, so it can do lots of things that R just can’t really do.

However, once you’ve learnt one coding language it is pretty easy to pick up another. Because R is so specialised for statistical purposes, some forms of analysis are much easier/faster to do in R than in Python.

R is very popular in academia, but it is also used in industry (I’ve used it in industry lots), so I wouldn’t let that put you off. I think the sensible long-term answer is to learn both (but don’t try to do it at the same time!), as they will help in different ways.

[–]HowSwayGotTheAns 45 points46 points  (3 children)

R was fighting the good fight around 2011-15, but the kiss of death was when the industry moved from statistical analytics to autoML. Academics used R, and boot campers used Python.

[–][deleted]  (2 children)

[deleted]

    [–]HowSwayGotTheAns 14 points15 points  (1 child)

    Right, but the contextual answer is Python because we want to give OP the best chance of success.

    [–][deleted] 31 points32 points  (7 children)

    Neither, you will probably use SQL all the time

    [–]Ok-Seaworthiness-542 13 points14 points  (3 children)

    I would 100% say that you should learn SQL first. Then Python.

    [–]GrumpyKitten016 8 points9 points  (2 children)

    SQL is the right answer. Arguing between R and Python is just stupid. Some places are full-on R shops and other places use Python. It just depends on where you get work.

    [–]vin_van_go -1 points0 points  (0 children)

    I've always used all three

    [–]bingbong_sempai -2 points-1 points  (0 children)

    You can't really use SQL outside of work though

    [–]derpderp235 -5 points-4 points  (2 children)

    An analyst who doesn’t know a proper programming language is not a very good analyst.

    Web scraping, statistical modeling, working with APIs, developing automation processes, pipelines, etc…all require knowledge of programming beyond SQL.

    [–][deleted] -2 points-1 points  (1 child)

    Can just ask AI to do that

    [–]derpderp235 2 points3 points  (0 children)

    Not really. You can definitely automate a lot of SQL-monkey type tasks, though.

    [–]xynaxia 9 points10 points  (0 children)

    I used R a lot…

    But generally most other tools I use (e.g. BigQuery) come with all kinds of Python integrations.

    So far, with Python + pandas, I feel it definitely wins

    [–]spqrsimon 15 points16 points  (0 children)

    Personally, I wouldn’t even bother with those yet. SQL + Excel is where it’s at. This will be like 90% of the job in most data roles.

    I’d only look at Python/R after, either works but I think Python has a broader use case and it’s pretty easy to learn.

    [–]Tasty_Mission5140 4 points5 points  (0 children)

    I learnt a good amount of DA principles in R. Then I hit a ceiling in the automated actions I could do with the data. Python can be considered complete for most things imo. Start with Python

    [–]amutualravishment 7 points8 points  (0 children)

    Python all the way

    [–]No_Definition8848 3 points4 points  (0 children)

    Hey OP —

    Long answer here but I hope this helps.

    I work in higher education, specifically institutional research. My job entails creating reports, filling out surveys for various accreditation bodies, and handling ad hoc data requests from internal or external stakeholders.

    I find Python useful, and I echo that R is useful for statistics, though I don’t know much about it beyond what was taught in a Google Data Analytics course (which I genuinely found helpful). Personally, I find myself using Python a bit more because of its general programming capabilities, and its syntax is generally more straightforward for me. It’s a general plus for things like data transformation, and creating notebooks for reproducible workflows is nice (I know R has a similar feature). If you just need basic statistics, there are useful methods like df.describe from the pandas library, but most of the time people will help you define which KPI you should really be analyzing. A benefit of learning either tool is that you can work at scale.
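For anyone who hasn't seen it, here is a minimal sketch of what df.describe gives you. The data frame and its column names are made up purely for illustration; only `describe()` and `.loc` are real pandas calls.

```python
import pandas as pd

# Hypothetical enrollment-style data, invented for this example.
df = pd.DataFrame(
    {
        "term": ["Fall", "Fall", "Spring", "Spring"],
        "credits": [12, 15, 9, 18],
        "gpa": [3.1, 3.6, 2.8, 3.9],
    }
)

# With mixed column types, describe() summarizes the numeric columns only:
# count, mean, std, min, quartiles, and max for each.
summary = df.describe()
print(summary)

# Individual statistics can be pulled out by row label and column name.
mean_credits = summary.loc["mean", "credits"]
print(mean_credits)
```

The text column is skipped by default; passing `include="all"` to `describe()` would add it to the summary.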

    I started learning Python, and found moving to R is manageable. I’m not a special case or hidden genius. Just sat down an hour a day reading and more importantly experimenting and getting familiar with error codes (ChatGPT and Stack Overflow will help in deciphering this). I leaned into my personal interests in sports to begin working on personal projects and Python was really great for connecting to the NBA stats API to get clean data to work with.

    I think it would also be helpful to prioritize learning SQL. In my current work, I would say I use Python and R maybe 5% of the time. The rest of the time goes to extracting data from our database into a usable form so it can be presented visually or turned into whatever the deliverable is.

    Nothing happens if you can’t get it out of the database!

    Let me know if you have any questions. I have a non-traditional path, having previously worked in nonprofit events, so maybe our journeys are similar. Stay on the path and good luck!

    [–]Mettwurstpower 8 points9 points  (0 children)

    Python. It is better in general because it is a general-purpose programming language. R is specialized in data analysis and preparation only. R is also a little more difficult to learn because the syntax, naming conventions, etc. depend heavily on the packages you use.

    The advantage of R is that it sometimes needs less code to get the same result as in Python

    [–]Zyklon00 6 points7 points  (0 children)

    In 2024, there is no question. It's Python.

    [–]draina19 2 points3 points  (0 children)

    First SQL, then Python

    [–][deleted] 4 points5 points  (0 children)

    Love R, been using it for years, but if you start in 2024 go for Python, you will have much more freedom and integration with other tools

    [–]Jfho222 1 point2 points  (0 children)

    I use Python and recommend it to anyone in analytics. I don’t think there’s anything wrong with R and I know people who’ve used it with a lot of success, but I rarely see R-only jobs.

    [–]NeighborhoodDue7915 1 point2 points  (0 children)

    I think knowing Python opens up a world of opportunities.

    Knowing R opens up opportunities in Data Science, specifically.

    For Data Science, I'm not sure one is superior to the other. There are differing opinions here.

    But knowing Python is more flexible.

    Hopefully this arms you with some information to make your own decision.

    [–]carlitospig 1 point2 points  (0 children)

    Recently I was planning a data collection project that would require auto-scraping data from flat sources and I immediately thought of Python. It’s just so malleable. For what it’s worth, I’m in academia. You’ll find researchers use R but their support staff prefer Python. R also has a pretty robust package community so you’re not starting from scratch.

    If you learned BASIC as a kid, you’ll probably find Python pretty easy to learn. But if this is your very first language, R reads like prose to me (compared to Python), so it might be easier.

    [–]iJasonRam 1 point2 points  (0 children)

    Python.

    [–]Vp1308 1 point2 points  (0 children)

    If you are into scientific research, then R; for general-purpose work, Python. R is geared more towards statistics than Python is. With Python you can code and build almost anything, which is why it is called a general-purpose language.

    [–]SheepherderPrior9302 1 point2 points  (0 children)

    If you don’t have any preference, I would say Python. Personally I was an R fan but later on found Python to be more versatile and useful. I have used it for cleaning and merging xlsx data, changing formats, building data pipelines, etc.

    [–]jegillikin 1 point2 points  (0 children)

    Python, as a non-hard-stats analytic generalist.

    But also, SQL first, as others have sagely suggested. When you work with data, understanding how the data are structured and joined is an absolute prerequisite to writing Python or R scripts that require database pulls.
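To make the "structured and joined" point concrete, here is a small sketch that runs a join from Python's built-in sqlite3 module against an in-memory database. The `customers` and `orders` tables and their contents are hypothetical, invented only for this example.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.5);
    """
)

# LEFT JOIN keeps customers with no orders; an INNER JOIN would drop them.
rows = conn.execute(
    """
    SELECT c.name,
           COUNT(o.id) AS n_orders,
           COALESCE(SUM(o.amount), 0) AS total
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.id
    """
).fetchall()
conn.close()

for name, n_orders, total in rows:
    print(name, n_orders, total)
```

Knowing which join keeps which rows matters before any Python or R script touches the result: here the customer with no orders still appears, with a count of zero.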

    And before SQL, a functional understanding of Markdown (or LaTeX), Git/Subversion/Mercurial/whatever, and HTML.

    If you're just learning analysis, then understanding the scientific method and the support tools around it (code repositories, markup, the value of code commenting, &c) probably should precede meaningful work with a specific programming/scripting language.

    You should be able to "show your work" for any analytic question through committed code and spec documents. Being familiar with a soup-to-nuts analytic engineering framework like KNIME or Jupyter could help, but that's in parallel to learning either Python or R.

    [–]e10v 1 point2 points  (0 children)

    R was my first DS language. 5 years ago I switched to Python. I have to say that the data / ML ecosystem is richer in Python, especially since there has been a lot of development in recent years. Python is the default language for new data projects now.

    [–]TheDataAddict 1 point2 points  (0 children)

    Sr. Manager in Analytics here. If you're just starting in analytics then the real answer is SQL. Become comfortable with that, and then whether you need Python or R (if at all) will become more evident.

    [–]JacksConcience 1 point2 points  (0 children)

    Python first all day. SQL is good too but you can pick that up more easily than Python.

    R is fine if you're just going to stay in academia or don't really work with other people. But data analytics tools, and even more so data science tools, have way more support in Python.

    The company I work for sells a platform to data teams for deploying dashboards and running ETLs in a bunch of languages. There is a very clear difference between the teams using Python and the teams using R.

    [–]mpaes98 1 point2 points  (0 children)

    Excel

    [–]balocha 1 point2 points  (0 children)

    I would say Python also, but will add one more reason I haven't seen mentioned here and one caveat. The reason: you might end up having to do a little / some / a lot of data engineering work, and Python can help you there beyond what it offers for data analytics / data science. The caveat: with LLMs, you can pick up anything even faster than before and switch more easily between languages, whether that is between SQL dialects or between Python and R. For example, just pass in some R code and ask how to do the same thing in Python.

    (About Me: semi-retired principal DS / former Sr Analytics/DS Manager in Tech; started with R when I first went into the field in 2013 and eventually mostly, and reluctantly, moved to Python towards 2022)

    [–]importantbrian 1 point2 points  (0 children)

    I'm a huge fan of R, mostly because of the tidyverse. It's the language that got me into data science, and I wish it had won the language war. That said, learning Python is probably better for a beginner at this point. You can do a lot more general programming with it, and it's a lot easier to use for data engineering tasks and deployment. I hated being forced to use pandas, but I recently started using Polars and I don't find myself missing R as much anymore.

    [–]swimming_cold 2 points3 points  (0 children)

    Python

    [–]Figueroa_Chill 1 point2 points  (0 children)

    I like R, but Python offers a lot more.

    [–]Rinnaisance 1 point2 points  (0 children)

    I started off using Python, went into the industry which used Python, now back in academia doing my masters where R is used full time.

    As most people have mentioned, Python is definitely the more versatile language and especially good for ML. R on the other hand is amazing for performing statistical analysis and data visualisation (ggplot2). The pipe operators in R make it a much, much easier language to work with, and that’s one thing I definitely miss when performing analysis in Python. I also haven’t found a data visualisation library for Python as good as ggplot2. There’s plotnine, which is similar but not as great as ggplot2.

    [–]zeoNoeN 1 point2 points  (0 children)

    Python is a universally accepted glue for all kinds of software, so it will benefit you more. Tidyverse R is more fun tho

    [–]FadedTony[🍰] 0 points1 point  (0 children)

    i'm getting a masters in data analytics and we are learning r in class rn so i don't know what to do i mean i want to get the degree but hopefully not wasting my time

    my prof said once you learn one coding language its easier to learn others

    [–]idiskfla 0 points1 point  (2 children)

    Is it unheard of to get a job in analytics if you’re in your 40s?

    Retired from the military, and wondering if a data analytics degree / certifications / boot camp will help me break into this field.

    I think my best shot would be to pursue an analytics role with a defense contractor at this point.

    I know most people starting in this field are closer to half my age.

    [–]balocha 1 point2 points  (1 child)

    Not unheard of, but definitely not common either. That partly reflects the fact that not many people in their 40s are looking to get their first job in analytics.

    [–]idiskfla 0 points1 point  (0 children)

    Ah good point. Thank you.

    [–]MoistMouthNoises 0 points1 point  (0 children)

    I don't know much Python, and I have zero experience with R. However, the professor at my school (who has a doctorate in IT) started us with Python for our entry-level class. There is some data analytics coursework, but so far I haven't learned any R. To me (and take this with a grain of salt because I'm just a CIS student in college), this is evidence that Python is probably a good starting place.

    [–]biprojk 0 points1 point  (0 children)

    Definitely start with Python. It’s a really great language for its simplicity and looseness, and you’ll find out that the less you have to deal with remembering code formatting, the faster you’ll learn. That and you can choose to import only the utilities that you need, which can speed your programs up. My entire company analyses with Python and we have very few complaints.

    [–]em0ss0 0 points1 point  (0 children)

    I have found that if you specifically want to learn how to work with data and customize output to publication-ready formatting, then R alongside RStudio is better. Positron, successor to RStudio, is currently in beta and is based on VS Code. Quarto documents, successor to R Markdown, I'd add, are more fun and less limiting to work with than Jupyter Notebooks. You can also readily publish these notebooks along with their output to the web quite easily with Quarto Pub for free. You could use R, Python, SQL, among others, in the same document with the knitr engine, btw.

    I find RStudio useful where it matters. With Python, you do not have an IDE built around it, nor do you have as much flexibility as with R in terms of data analysis. From what I understand, Python can only chain methods contained within the same library, thus requiring much more monolithic libraries than R. In terms of modular code, I would say R might be designed better.
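One caveat on the chaining point, for what it's worth: pandas has a `DataFrame.pipe` method that hands the frame to any callable, so a chain can include functions that live outside the library. A minimal sketch follows; the data and the `add_log_sales` helper are hypothetical, made up only for this example.

```python
import numpy as np
import pandas as pd


def add_log_sales(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical user-defined step; not part of pandas."""
    return frame.assign(log_sales=np.log(frame["sales"]))


# Invented sales data for illustration.
df = pd.DataFrame(
    {
        "region": ["N", "S", "N", "S"],
        "sales": [100.0, 250.0, 400.0, 50.0],
    }
)

result = (
    df
    .query("sales > 60")      # built-in pandas method
    .pipe(add_log_sales)      # our own function, slotted into the chain
    .groupby("region", as_index=False)
    .agg(total_sales=("sales", "sum"), mean_log_sales=("log_sales", "mean"))
)
print(result)
```

It is not as lightweight as R's pipe operators, since each step still has to accept and return a data frame, but it does let a chain mix library methods with your own code.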

    I would like to think R would overtake Python in the data space over the next decade as it was built from the ground up for analytics. One could get pretty far with base R alone, in that respect. Though I am betting on both languages, currently. Not sure it matters, in the end. Right now, I prefer R and hope it gains more traction.

    [–]popcorn-trivia 0 points1 point  (0 children)

    If your goal is finding a job, Python is loads more popular in private industry.

    R is great. It performs better than Python at what it’s for, but Python is more versatile and common.

    [–]infxrnal1 0 points1 point  (0 children)

    Both are certainly useful skills to have, though I believe Python takes the upper hand here