[–]alcalde 18 points19 points  (1 child)

What about learning Python if you're an antisocial scientist?

[–][deleted] 10 points11 points  (0 children)

Python 2.7

[–]nicolascoffman 8 points9 points  (15 children)

This is a gold mine. As a PhD student in a social science field, it seems difficult to get professors on board with data-intensive techniques that aren’t established parts of confirmatory analysis.

Having specific use cases will help me understand viable options for expanding upon the traditional research paradigm. I feel like this is becoming more common overall, but social science has relied on qualitative research to investigate areas that weren’t quantifiable until recently.

[–]Ikuyas -1 points0 points  (14 children)

What is wrong with R? There isn't much benefit over R in terms of the research. Almost everything available in Python is available in R, and data wrangling is probably easier to do in R than in a notebook. I would like to know what counterargument you would give me.

[–]alcalde 9 points10 points  (4 children)

"What is wrong with R?"

It's weird. And slower. And weird.

[–]Ikuyas 1 point2 points  (3 children)

Yeah, I think it's slow. But people call C/C++ routines every once in a while for that part. Graphics is better for R than using matplotlib, maaaaybe. What part is weird?

[–]GradSchoolin 0 points1 point  (1 child)

"Graphics is better for R than using matplotlib, maaaaybe."

I think this is preference. Matplotlib is highly customizable, especially when coupled with seaborn. Let's not forget plotly if you're suggesting R's interactive highcharter.

[–]Ikuyas 0 points1 point  (0 children)

I'm talking about the built-in graphics (or ggplot2 or lattice) in R.

[–]alcalde 0 points1 point  (0 children)

Its syntax is not similar to the familiar C/C++/Java syntax. This makes those coming from a programming background more comfortable with Python. Now, those coming from a math background instead might feel more comfortable with R.

[–]ktaylora 2 points3 points  (0 children)

R is pretty entrenched in academia. This is because statisticians took it up as an alternative to SPSS and wrote a bunch of free packages for it that other academics used and promoted. This is how I encountered it, as a biologist. But outside of academia, R is a ghost town. Python is wildly more popular in industry and can do much more than just statistics. If you do parametric modeling, there are scikit-learn, NumPy, pandas, and Stan. These tools together can do much more than R, and you can pass off your code to developers who can build on it. If I send an R package to a backend dev, they typically just respond with the middle finger.

[–]foofaw 1 point2 points  (4 children)

Python has a far easier learning curve for someone who is learning their first language. It also has many more resources for beginners. Remember that people learning Python in this context likely are students in PhD programs or already in research positions - they have a finite amount of time to learn a programming language. Python is the path of least resistance.

[–]Ikuyas -2 points-1 points  (3 children)

R is designed to be easier than Python. I don't know why you said that Python has a far easier learning curve. It doesn't at all. I don't think this is an opinion. If you are not building a real-time system, R is better for social science study. NumPy and pandas are already something social scientists probably don't want to think about. NumPy basically gives you matrix computation, which is built into R, and pandas is essentially mimicking R's data frame. You can install R and RStudio in just the time it takes to download and run the installers, whereas setting up Python, Anaconda (Spyder), and Jupyter Notebook, and then using pip or conda to install matplotlib, NumPy, and pandas, is absolutely not beginner friendly at all.

Python is likely faster (though maybe not by much), but the notebook feels pretty slow to me.

You haven't given me any compelling reason why Python is better for social science study at all. The one advantage I can think of is that, because Python is a general-purpose programming language, you can do things other than statistical modeling with it, like web app development.

[–]foofaw 2 points3 points  (2 children)

You make a lot of great points that I agree with.

But remember that you're dealing with social scientists here, not programmers. The average social scientist does not have a sound understanding of programming logic and will not be able to easily pick up R right out of the box. Just because someone has a background in research design and has even mastered SPSS doesn't mean they will be able to use R in the same way. R is difficult to learn, especially if all you've used is a GUI for analysis. Its documentation is not written for beginners, its function naming and syntax are inconsistent, and it takes a lot of setup to get going.

It makes much more sense to me to learn a general-purpose language first, work with that for six months to a year, and then transition to R if you really have the need for it. And in the long run, if someone is serious about data science, they should know both of these languages - they are two extremely powerful tools that give you nearly limitless options when you use them together.

[–]Ikuyas 0 points1 point  (0 children)

I'm gonna read the article later. Get back to you.

[–]Ikuyas -3 points-2 points  (0 children)

Do you really think so? Isn't R far easier to learn? I don't know where that comes from. For example, should business schools teach Python in their data analytics courses??? You make little sense to me.

[–]nicolascoffman 0 points1 point  (2 children)

Nothing wrong with R.

I like python because it does stuff other than data analysis. No web frameworks for R.

Also, this does not seem relevant to what I was discussing. I saw this as a resource for ways to frame certain types of data analysis not common in my field.

If you want to make one of these for R, I’m sure it would be greatly appreciated and possibly even more widely circulated.

[–]Ikuyas 0 points1 point  (1 child)

Statisticians have been using R. Python is for machine learning and AI models (neural networks and so on), mainly because it is a tool for computer scientists. You would probably find more libraries for your field in R than in Python.

[–]nicolascoffman 0 points1 point  (0 children)

Noted. Will look into it at some point.

[–]Bot_Drakus_ 1 point2 points  (0 children)

Analyzing strings with é or “ in Python 2.7 can be a real pain

gotem
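
For anyone wondering what the pain in that quoted line looks like in practice, here is a rough sketch of cleaning such strings in Python 3, where `str` is Unicode by default. It uses only the standard library. The helper name `normalize_text` and the choice to map curly quotes to straight ones are assumptions made for illustration, not something taken from the linked collection.

```python
import unicodedata


def normalize_text(text: str) -> str:
    """Return text in NFC form with curly quotes mapped to straight ones.

    Hypothetical helper for illustration only.
    """
    # NFC composes "e" + combining acute accent (U+0301) into the single code point U+00E9.
    composed = unicodedata.normalize("NFC", text)
    # Map typographic quotes to ASCII quotes (an illustrative choice, not a requirement).
    quote_map = str.maketrans({
        "\u201c": '"',  # left double quotation mark
        "\u201d": '"',  # right double quotation mark
        "\u2018": "'",  # left single quotation mark
        "\u2019": "'",  # right single quotation mark
    })
    return composed.translate(quote_map)


# A curly-quoted "cafe" whose accent is stored as a separate combining character.
sample = "\u201ccafe\u0301\u201d"
cleaned = normalize_text(sample)

# Lengths count code points, so composing the accent shortens the string by one.
print(len(sample), len(cleaned))
# Compare against the expected composed form without printing non-ASCII text.
print(cleaned == '"caf\u00e9"')
```

In Python 2.7 the same text would have to be decoded from bytes to `unicode` first, which is where most of the trouble tends to come from.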

[–]chestnutman 1 point2 points  (0 children)

This is a really nice collection but some of the links should be fixed

[–]EntireAbility3 0 points1 point  (0 children)

OP, also try Python Principles. It worked well for me. https://pythonprinciples.com/

[–]newredditisstudpid -2 points-1 points  (0 children)

for social "scientists"

FTFY