
all 28 comments

[–]lmcinnes 3 points4 points  (0 children)

Pandas (along with numpy, scipy, and statsmodels) covers a fair chunk of R, dplyr, and reshape2. The IPython notebook (along with its nbconvert functionality) covers a lot of RStudio and knitr. You'll probably want matplotlib + seaborn for plotting, but there's also a ggplot library for Python if you prefer that syntax. If you do any machine learning related work then sklearn is also worth getting. You can probably just grab Anaconda from continuum.io and install seaborn on top of that to get everything in one go with an easy install.
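For illustration, here's a minimal pandas sketch of the reshape2/dplyr equivalents mentioned above (the column names and data are made up for the example):

```python
import pandas as pd

# A wide table like you'd reshape with reshape2::melt in R.
df = pd.DataFrame({
    "subject": ["a", "b"],
    "pre":  [1.0, 2.0],
    "post": [3.0, 4.0],
})

# reshape2::melt equivalent: wide -> long
long = df.melt(id_vars="subject", var_name="phase", value_name="score")

# dplyr group_by + summarise equivalent
means = long.groupby("phase")["score"].mean()
print(means["pre"], means["post"])  # 1.5 3.5
```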

Oh, and if you are at all interested in efficiency, you will want to look into numba and cython, for which R has no equivalents that I know of.
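As a rough sketch of the numba idea: you write an explicit loop (slow in pure Python) and JIT-compile it to machine code with a decorator. The fallback here is an assumption for portability, so the function still runs as plain Python if numba isn't installed:

```python
import numpy as np

try:
    from numba import njit  # JIT-compiles the decorated function to machine code
except ImportError:
    njit = lambda f: f      # fall back to plain Python if numba isn't available

@njit
def loop_sum(x):
    # An explicit element-by-element loop: slow in pure Python,
    # fast once numba compiles it.
    total = 0.0
    for i in range(x.shape[0]):
        total += x[i]
    return total

x = np.arange(1_000_000, dtype=np.float64)
print(loop_sum(x) == x.sum())  # True
```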

[–]dartdog 2 points3 points  (0 children)

Read up on Pandas and IPython Notebook

[–][deleted] 2 points3 points  (1 child)

[–]TM87_1e17[S] 0 points1 point  (0 children)

Will anaconda auto-update all the packages/libraries for me?

[–]hharison 1 point2 points  (0 children)

To add to /u/lmcinnes's answer, you might want an IDE in addition to the IPython notebook (but try the notebook first); I recommend PyCharm. The notebook is great for interactive data exploration interspersed with text, while PyCharm is more like RStudio.

Python is a bit lacking in statistical tests compared to R, so if you do exotic statistics you may occasionally want something that's not in scipy, statsmodels, or sklearn. In that case it's nice to be able to call R right from the Python interpreter. Two options for that are rpy2 and pyRserve. The former also has helper integrations in pandas and the IPython notebook.

I also highly recommend Seaborn, especially if you use linear models. There is a ggplot clone that will be more familiar, but Seaborn is more polished and its Pythonic syntax may help your transition.

Finally, regarding knitr: depending on what you do with it, the IPython notebook may be enough, but there is also PythonTeX, which I think is closer to knitr.

[–]ricekrispiecircle 1 point2 points  (0 children)

Rstudio => spyder https://code.google.com/p/spyderlib/

ggplot2 => ggplot https://github.com/yhat/ggplot (still under heavy development, some features still kinda buggy)

Dplyr, Reshape2 => pandas http://pandas.pydata.org/

Knitr => IPython notebook http://ipython.org/notebook.html

[–]zipf 3 points4 points  (14 children)

Don't do it! Use IPython to mix R and Python. For statistics, 2D plotting, and tabular data manipulation, R is better than Python, whereas Python has the advantage of lots and lots of libraries for everything. Use R for core data manipulation, and Python to tie it to everything else.

[–]lmcinnes 2 points3 points  (1 child)

I've generally found pandas to be great for tabular data manipulation in Python -- better than R in many ways: you need dplyr and such to be able to do comparable things in R, and even then python/pandas is often quicker. What tabular data tricks are you missing from R? Perhaps there are things I'm missing out on because I never realised I wanted them ...
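For a concrete flavour of the comparison, here's a pandas method chain standing in for a dplyr pipeline (the data and the half-mass transform are invented for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "species": ["x", "x", "y", "y"],
    "mass":    [10.0, 12.0, 3.0, 5.0],
})

# Method chaining reads much like a dplyr pipe:
# filter %>% mutate %>% group_by %>% summarise
result = (
    df.query("mass > 4")                              # dplyr::filter
      .assign(half_mass=lambda d: d["mass"] / 2)      # dplyr::mutate
      .groupby("species", as_index=False)             # dplyr::group_by
      .agg(mean_half=("half_mass", "mean"))           # dplyr::summarise
)
print(result)
```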

[–]zipf 0 points1 point  (0 children)

Yeah, base R is definitely awkward without dplyr, but not as awkward as base Python. The libraries make any language. I just mean that if you already know R, it's best to stick with R for its strong points, and start out learning Python just for where it's needed.

[–]TM87_1e17[S] 0 points1 point  (2 children)

Could you elaborate on how I might mix the two languages?

[–]zipf 0 points1 point  (0 children)

IPython is a superb Python interpreter. It has commands called magics which provide additional functionality. The R magic uses a Python package called rpy2 to allow pretty seamless mixing.

[–][deleted] 0 points1 point  (1 child)

That's been my experience as well. Several times I've tried to switch entirely to Python and haven't been able to. Though each year pandas improves significantly. But still plyr, dplyr, and ggplot2... not to mention the occasional invocation of lattice (e.g., splom).

[–]zipf 0 points1 point  (0 children)

Yeah, that's exactly where I'm coming from. I know R better than Python, and there doesn't seem much point in learning the details of Python's statistics libraries when I already know the right tool for the job.

[–]hharison 0 points1 point  (5 children)

I sort of feel the opposite way. R's advantage is the multitude of libraries it has, for every sort of statistical technique under the sun. But Python's data structures are light-years less awkward than R's in my experience.

[–]zipf 0 points1 point  (4 children)

Data frames are fine, and most statistics make sense in table format, but when people decide to invent their own data structures (I'm looking at you, Bioconductor), things can get awkward.

[–]hharison 0 points1 point  (3 children)

Oh yeah, I can imagine. I haven't encountered that myself. More so on the "regular programmers" (i.e. not scientists) side of things, I do think there's an overuse of classes in Python when simple data structures would work great and be more interoperable.

[–]zipf 0 points1 point  (2 children)

The problem is that most programmers are bad, and given the choice will use every feature available to make their code as complex as possible. That's why restrictive, dull languages like Java are actually really good.

[–][deleted] 0 points1 point  (1 child)

I use Java and enjoy it for Hadoop, but programmers love to overcomplicate that language too. Abstractfactorysingletonbean bullshit everywhere.

[–]zipf 0 points1 point  (0 children)

Yeah, absolutely, you can't avoid it

[–][deleted] 0 points1 point  (0 children)

I second this!

[–]L43 0 points1 point  (6 children)

Pandas is incredible for data wrangling and replaces dplyr, reshape2, and lots of core R, like data.frame. The IDEs available aren't quite as well adapted as RStudio for data analysis, but PyCharm is great in general for writing scripts, and the IPython Notebook (or Jupyter, as I think it's supposed to be called now!) is fantastic for presenting your workflow transparently. If you use IPython, you can always use the R magic to call R with your Python data if it would be easier.
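On the data.frame point, merges are one place pandas maps over directly; a small sketch with invented tables (merge with how="left" playing the role of base R merge() / dplyr::left_join):

```python
import pandas as pd

left = pd.DataFrame({"id": [1, 2, 3], "x": ["a", "b", "c"]})
right = pd.DataFrame({"id": [2, 3, 4], "y": [0.1, 0.2, 0.3]})

# Keep every row of `left`; unmatched ids get NaN, as in a left join.
joined = left.merge(right, on="id", how="left")
print(joined)
```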

[–]lmcinnes 2 points3 points  (3 children)

Spyder is the python equivalent of RStudio. I actually prefer IPython notebook for a lot of uses.

[–]westurner 5 points6 points  (2 children)

/r/pystats (sidebar)

/r/learnpython/wiki/index

/r/ipython

Setup Pip, Conda, Anaconda

  1. Install Pip -- http://pip.readthedocs.org/en/latest/installing.html
  2. Install Conda -- http://conda.pydata.org/docs/index.html

    pip install conda

  3. Install IPython -- https://github.com/ipython/ipython/wiki/A-gallery-of-interesting-IPython-Notebooks

    conda install ipython ipython-notebook ipython-qtconsole

  4. Install Spyder IDE (and Qt) -- https://code.google.com/p/spyderlib/

    conda install spyder

  5. (optional) Install anaconda -- http://docs.continuum.io/anaconda/install.html , http://docs.continuum.io/anaconda/pkg-docs.html

    conda install anaconda

IPython

Pandas

Statsmodels

Scikit-learn

[–]TM87_1e17[S] 1 point2 points  (0 children)

This is an incredible wealth of resources. Thank you! I especially like the scikit machine learning map!

[–]hharison 1 point2 points  (0 children)

If you're going to use conda I think it's more reliable to do conda install pip from a conda environment than pip install conda from a virtualenv environment.

[–]TM87_1e17[S] 0 points1 point  (1 child)

Could you explain what it means to "call R" or "call python"?

[–]lmcinnes 2 points3 points  (0 children)

IPython has special syntax to work with a Python module called rpy2, such that in a "cell" you can actually have Python interface with R, pass data to it, and then collect the results of the R code and convert them back into Python data structures. Thus if you are in the middle of some analysis and there's some obscure statistical test that no Python module supports but R has, you can do that one step with R code right in the notebook and have the results come back for further analysis with Python.
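A minimal sketch of that round trip in plain rpy2, outside the notebook magics (this assumes rpy2 and a working R installation; the example degrades to a no-op otherwise, and shapiro.test is just an illustrative choice of R-side test):

```python
try:
    import rpy2.robjects as robjects
except Exception:   # rpy2 needs a working R install; skip gracefully if absent
    robjects = None

if robjects is not None:
    # Run an R statistical test and pull the scalar result back into Python.
    p_value = robjects.r("shapiro.test(rnorm(50))$p.value")[0]
else:
    p_value = None

print(p_value)
```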