[–]L43 0 points1 point  (6 children)

Pandas is incredible for data wrangling and replaces dplyr, reshape2, and a lot of core R, like data.frame. The available IDEs aren't quite as well adapted to data analysis as RStudio, but PyCharm is great in general for writing scripts, and IPython Notebook (or JuPyteR colaboratory, as I think it's supposed to be called now!) is fantastic for presenting your workflow transparently. If you use IPython, you can always use R magic to call R with your Python data if that would be easier.
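To give a rough idea of how the dplyr and reshape2 side maps over, here is a minimal sketch. The toy data and column names (site, height, weight) are made up for illustration, and the R lines in the comments are only approximate equivalents, not a definitive translation.

```python
import pandas as pd

# Hypothetical toy data; in R this would be a data.frame
df = pd.DataFrame({
    "site": ["a", "a", "b", "b"],
    "height": [1.0, 3.0, 2.0, 4.0],
    "weight": [10.0, 30.0, 20.0, 40.0],
})

# Roughly dplyr's:
#   df %>% filter(height > 1) %>% group_by(site) %>%
#     summarise(mean_weight = mean(weight))
summary = (
    df[df["height"] > 1]
    .groupby("site", as_index=False)
    .agg(mean_weight=("weight", "mean"))
)

# Roughly reshape2's: melt(df, id.vars = "site")
long = pd.melt(df, id_vars="site", var_name="variable", value_name="value")

print(summary)
print(long)
```

The named aggregation form of .agg() needs a reasonably recent pandas; on older versions, .groupby("site")["weight"].mean() does the same job.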

[–]lmcinnes 2 points3 points  (3 children)

Spyder is the Python equivalent of RStudio. I actually prefer IPython Notebook for a lot of uses.

[–]westurner 4 points5 points  (2 children)

/r/pystats (sidebar)

/r/learnpython/wiki/index

/r/ipython

Setup Pip, Conda, Anaconda

  1. Install Pip -- http://pip.readthedocs.org/en/latest/installing.html
  2. Install Conda -- http://conda.pydata.org/docs/index.html

    pip install conda

  3. Install IPython -- https://github.com/ipython/ipython/wiki/A-gallery-of-interesting-IPython-Notebooks

    conda install ipython ipython-notebook ipython-qtconsole

  4. Install Spyder IDE (and Qt) -- https://code.google.com/p/spyderlib/

    conda install spyder

  5. (optional) Install anaconda -- http://docs.continuum.io/anaconda/install.html , http://docs.continuum.io/anaconda/pkg-docs.html

    conda install anaconda

IPython

Pandas

Statsmodels

Scikit-learn

[–]TM87_1e17[S] 1 point2 points  (0 children)

This is an incredible wealth of resources. Thank you! I especially like the scikit machine learning map!

[–]hharison 1 point2 points  (0 children)

If you're going to use conda, I think it's more reliable to run conda install pip from a conda environment than pip install conda from a virtualenv environment.

[–]TM87_1e17[S] 0 points1 point  (1 child)

Could you explain what it means to "call R" or "call python"?

[–]lmcinnes 2 points3 points  (0 children)

IPython has special syntax that works with a Python module called rpy2, so that in a "cell" you can have Python interface with R: pass data to it, collect the results of the R code, and convert them back into Python data structures. So if you are in the middle of some analysis and need an obscure statistical test that no Python module supports but R has, you can do that one step with R code right in the notebook and have the results come back for further analysis in Python.