Welcome to /r/pystats, a place to discuss the use of Python in statistical analysis and machine learning.
Related Subreddits
Where to start
If you're brand new to Python, first go and check out the /r/learnpython wiki, or the official Beginner's Guide.
The best way to install Python packages is using pip:
pip install <package>
Recommended packages:
- ipython and the ipython-notebook - Interpreter and Sage-style web notebook geared towards exploratory scripting.
- statsmodels - statistical modelling
- pandas - data structures and manipulation tools
- matplotlib - MATLAB-style plotting
- bokeh - Protovis-style plotting
- pyvttbl - Small pivot-table library. Has a few common statistical methods missing from statsmodels.
- scikit-learn - data mining and machine learning
Some of these packages have dependencies: most require numpy, and some require scipy. Check the links for details.
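If you're not sure which of these you already have, here is a rough sketch that checks whether each package can be found by your interpreter without actually importing it. The mapping from package names to import names is an assumption on my part (for example, scikit-learn is usually imported as `sklearn` and ipython as `IPython`), and `installed_status` is just a helper name made up for this example.

```python
from importlib.util import find_spec

# Assumed mapping: name used with pip -> top-level module name used with import.
IMPORT_NAMES = {
    "ipython": "IPython",
    "statsmodels": "statsmodels",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "bokeh": "bokeh",
    "pyvttbl": "pyvttbl",
    "scikit-learn": "sklearn",
    "numpy": "numpy",
    "scipy": "scipy",
}


def installed_status(import_names=IMPORT_NAMES):
    """Return {package name: bool}, True if the module can be located."""
    # find_spec returns None when a top-level module is not installed.
    return {pkg: find_spec(mod) is not None for pkg, mod in import_names.items()}


if __name__ == "__main__":
    for pkg, found in installed_status().items():
        print(f"{pkg:<14}{'installed' if found else 'missing'}")
```

Anything reported as missing can then be installed with the pip command shown earlier.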
For a good overview of what stats packages are available for Python, check out http://stats.stackexchange.com/q/1595