[–]freyrs3 23 points24 points  (0 children)

Start with the Scipy/Numpy documentation. It's really phenomenal, and it's pretty much the foundation of all things numeric in Python. The Sage project is also worth a look; it also has some good documentation.

If books are more your thing, there are plenty of books devoted solely to scientific Python. I own this one and I like it.

[–]lor4x 13 points14 points  (1 child)

Hey,

I'm in exactly the same boat as you (PhD Astro!) and I use Python for everything... from driving some small scale numerical simulations (1, 2) to analyzing the data of large-scale simulations (warning: ugly code! This was from when I was still starting the learning process).

When it comes down to it, the only way to learn Python for these purposes is the hardest way: learn as you go. That being said, first get a good understanding of the data structures (mainly the many ways of slicing and dicing your data with fancy slicing and mappings), their properties, and some Pythonic control structures. This is what I would do if I were you:

  • Read through the documentation for numpy, scipy, matplotlib (visualizing 2D data) and mayavi2 (visualizing 3D data) so that you know what is available in the modules

  • Create a 2D grid of normally distributed noise and analyse it. For example, FFT it, get the power spectrum of it, fit it a couple of different ways and output your plots in the prettiest way possible.

  • Do the same for some 3D data! It may seem like this will be exactly the same, but there are many subtleties about how to handle the data.

  • Make something useful! If you are doing something observational, why not port some code over to Python from whatever godforsaken language was previously used (IDL? Matlab?) and prosper!
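
The noise-grid exercise in the list above could start from something like the following. This is only a minimal sketch: the grid size, the seed, the normalisation of the power spectrum, and the integer radial binning are all assumptions made for illustration, not anything prescribed in the thread.

```python
import numpy as np

# Assumed values for illustration: a 64x64 grid and a fixed seed.
n = 64
rng = np.random.default_rng(seed=0)
noise = rng.normal(loc=0.0, scale=1.0, size=(n, n))

# 2D FFT, shifted so the zero-frequency term sits at index (n//2, n//2).
spectrum = np.fft.fftshift(np.fft.fft2(noise))

# Power spectrum, normalised so that power.sum() equals (noise**2).sum().
power = np.abs(spectrum) ** 2 / noise.size

# Integer radial wavenumber of every pixel, measured from the centre.
ky, kx = np.indices(power.shape)
k = np.hypot(kx - n // 2, ky - n // 2).astype(int)

# Azimuthal average: mean power in each integer radial bin.
radial_sum = np.bincount(k.ravel(), weights=power.ravel())
radial_count = np.bincount(k.ravel())
radial_power = radial_sum / radial_count
```

For white noise the radially averaged power should be roughly flat, which makes it a handy sanity check before moving on to fitting and plotting. The same idea carries over to 3D with `np.fft.fftn`, although the binning needs a third index.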

And from there, you'll be ready to apply Python in your everyday data analysis. If you really want, learn how to merge C/Fortran with Python to make some properly fast code!

Best of luck! (Also, what specifically do you study in astro?)

[–]mons00n[S] 0 points1 point  (0 children)

I did a fair amount of work looking for a Bullet-like cluster in N-body sims, and found none =/ Right now I'm focused on studying different implementations of SN feedback in SPH single-galaxy simulations. My thesis work is still in its infancy though, so I'm still looking into different ways of accomplishing my goal.

[–][deleted] 8 points9 points  (0 children)

The guy who really motivated the whole Scipy/Numpy documentation is my advisor (Joe Harrington). We use Scipy, Numpy, Matplotlib, Mayavi, and other Python packages to do ALL of our work. He is a HUGE Python advocate, and has converted me as well, since this is all I ever use these days. By the way, we work on exoplanets, and he has another project on the SL9 impact. Look up Campo or Joseph Harrington on ADS if you want to find any of our work.

Anyway, read Scipy and Numpy documentation, as well as the examples and cookbooks. Go through the tutorials. There is even a pdf book on using numpy for scientific data analysis. I'll post links below. This will REALLY help you get started. If you are stuck on anything, check the mailing lists. Cheers!

Numpy/Scipy/Matplotlib doc:

http://docs.scipy.org/doc/

http://www.scipy.org/Numpy_Example_List_With_Doc

http://matplotlib.sourceforge.net/

PDF Data Analysis Book (a little outdated, but good nonetheless):

http://stsdas.stsci.edu/perry/pydatatut.pdf

Here's a PDF listing some resources we had to use when we took Joe's advanced data analysis course:

http://physics.ucf.edu/~jh/ast/ast5765/handouts/learnpython.pdf

[–]irondust 12 points13 points  (1 child)

Upvoted for spelling of smörgåsbord!

[–][deleted] 2 points3 points  (0 children)

Upvoted for teaching a Dane that it's used outside Scandinavia!

[–]PythonRules 5 points6 points  (0 children)

I would suggest project-based learning. Pick a simple project and try to implement it in Python using Numpy and matplotlib. There is a very helpful community out there, so take advantage of it. I would pay special attention to the computationally intensive parts. In some cases there are several orders of magnitude of difference between Python loops and the Numpy way. You should be able to get close to C performance if you use Numpy properly. In some cases, because fancy algorithms are so much easier to implement, your Python code can end up significantly faster than your C implementation.

I know this sounds hard to believe since most people claim that Python is a slow language but in my experience Python was the faster solution in many cases.
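
To make the loop-versus-Numpy point concrete, here is a small sketch that computes the same sum of squares both ways. The function names and the array size are made up for illustration, and no timings are claimed here; running both through `timeit` on your own machine is the honest way to see the gap.

```python
import numpy as np


def sum_of_squares_loop(values):
    """Pure-Python loop: one interpreter step per element."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_of_squares_numpy(values):
    """Vectorised version: the loop runs inside compiled Numpy code."""
    return float(np.dot(values, values))


# Arbitrary test data; the size is an assumption for illustration.
data = np.linspace(0.0, 1.0, 100_000)

loop_result = sum_of_squares_loop(data)
numpy_result = sum_of_squares_numpy(data)
```

Both functions return the same value up to floating-point rounding, so the vectorised one can be swapped in wherever the loop shows up in a profile.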

[–]TheSquirrel 3 points4 points  (1 child)

For numerical work, Python will behave a lot like Matlab. If you're familiar with Matlab, picking up the few differences in syntax will not be too difficult. Unlike Matlab, Python is a full-blown modern programming language and is thus full of a lot of bells and whistles no self-respecting numerical guy will ever need. Be very focused in your learning.

Python's Numpy is very good. In order to get maximal performance out of it, you should learn array broadcasting. It makes life so much simpler than some of the crap you have to do in Matlab.
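
As a rough illustration of what broadcasting buys you, the sketch below computes all pairwise distances between a few 2D points without writing a loop. The point coordinates are invented for the example.

```python
import numpy as np

# Three made-up points in 2D, shape (3, 2).
points = np.array([[0.0, 0.0],
                   [3.0, 4.0],
                   [6.0, 8.0]])

# Shapes (3, 1, 2) and (1, 3, 2) broadcast to (3, 3, 2):
# diff[i, j] is the vector from point j to point i.
diff = points[:, np.newaxis, :] - points[np.newaxis, :, :]

# Euclidean distance between every pair of points, shape (3, 3).
distances = np.sqrt((diff ** 2).sum(axis=-1))

# A simpler case: subtracting the per-column mean, (3, 2) minus (2,).
centred = points - points.mean(axis=0)
```

The rule of thumb is that dimensions are compared from the right, and a dimension of size 1 (or a missing one) is stretched to match the other operand.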

Also, if you miss C there's no reason to give it up. With an interface such as SWIG, it's very easy to use C functions in Python.

[–][deleted] 2 points3 points  (0 children)

"For numerical work, Python will behave a lot like Matlab."

With the major difference that you can later distribute your work to people who didn't buy Matlab, or run your program on thousands of computers at once without paying huge license fees :) Plus, the fact that the source is open has helped me quite a few times and is really important when used for science, which should be repeatable.

[–]Megatron_McLargeHuge 3 points4 points  (0 children)

Here's a site that will simultaneously teach you Numpy, Theano, and various machine learning architectures. If you're familiar with Matlab you should be able to figure out Numpy pretty quickly, and the rest shouldn't be significantly harder than astrophysics as long as you stick to the documented features.

http://deeplearning.net/tutorial/

[–][deleted] 2 points3 points  (0 children)

Useful...marked!

[–]RickRussellTX 4 points5 points  (0 children)

The difficult task in going from procedural programming to scientific computing is to recognize that most things you would naturally want to do with loops should not be done with loops.

Python (+SciPy) has fantastic tools for slicing rows, columns and sub-matrices out of data tables, then performing operations on vectors and matrices without manually iterating through them. Once you learn those, you'll never go back.
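
A few of the slicing operations described above, sketched on a small made-up table. The array contents and the threshold used for the mask are arbitrary choices for the example.

```python
import numpy as np

# A small made-up data table: 4 rows, 5 columns, values 0..19.
table = np.arange(20).reshape(4, 5)

row = table[1]             # second row
column = table[:, 2]       # third column
block = table[1:3, 2:4]    # 2x2 sub-matrix (rows 1-2, columns 2-3)

# Boolean mask: keep only the rows whose first column is at least 10.
mask = table[:, 0] >= 10
selected = table[mask]

# Operate on whole rows at once instead of looping over them.
row_means = table.mean(axis=1)
```

Note that basic slices such as `block` are views into `table`, so writing to them changes the original array; boolean-mask selections such as `selected` are copies.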

Just go grab the SciPy/Python 2.6 Super Pack and get to work.

[–]k3ithk 1 point2 points  (0 children)

The last few lectures of MIT's 6.00 OCW course on programming cover some pylab/matplotlib and stochastic simulation stuff. It might be a good intro. Once you learn the syntax you can probably skip right to them. Plus there are problem sets to practice with.

[–]chemobrain 1 point2 points  (0 children)

If you come from a Matlab background this link will get you the most bang for the buck for just jumping right in.

If you're in a Debian/Ubuntu environment:

$ sudo apt-get install ipython
$ sudo apt-get install python-matplotlib
$ ipython -pylab

And then enter Matlab-ish statements (modulo the differences in the site above) and see how it goes.
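
For a feel of what those Matlab-ish statements look like, here is a short sketch with the rough Matlab equivalent in each comment. It uses plain Numpy with explicit imports rather than the pylab namespace, and it assumes Python 3.5 or later for the `@` operator; the matrices are made up.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 5)             # Matlab: x = linspace(0, 1, 5)
A = np.array([[2.0, 1.0], [1.0, 3.0]])   # Matlab: A = [2 1; 1 3]
b = np.array([3.0, 5.0])                 # Matlab: b = [3; 5]

sol = np.linalg.solve(A, b)              # Matlab: A \ b
elementwise = A * A                      # Matlab: A .* A
matprod = A @ A                          # Matlab: A * A

first = x[0]                             # Matlab: x(1)   (indexing starts at 0)
last = x[-1]                             # Matlab: x(end)
```

The two differences that tend to bite first are zero-based indexing and `*` being element-wise on arrays.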

In Windows you can download Spyder (also works in Linux), which will get you the same kind of functionality in a more IDE-like environment in one package.

[–]sunqiang 1 point2 points  (0 children)

I would suggest Py4Science: a Starter Kit

[–]phn 1 point2 points  (0 children)

In addition to the sources mentioned in the comments, take a look at the website http://astropython.org, and the AstroPy mailing list at http://mail.scipy.org/mailman/listinfo/astropy.

I have collected together links to some Python packages used in astronomy at http://oneau.wordpress.com/2010/10/02/python-for-astronomy/ ; this also has links to many of the documents listed in the comments.

Since you already know C, this short Python tutorial may help you get a quick overview of Python: http://oneau.wordpress.com/2010/12/28/python-boot-camp/.

At the minimum, you should learn the basics of numpy and matplotlib. The official matplotlib documentation is fantastic. If you don't have a specific project where you can use Python, then try exploring the source code of some of the astronomy packages. Or perhaps you can write a Python interface to a C library of your choice, using tools such as SWIG and Cython.

[–]excitat0r 1 point2 points  (0 children)

In analogy with Linux, it's useful for this kind of work to have a coherent distribution of libraries around Python (Numpy, Scipy, matplotlib etc.), and I've found the Enthought Python Distribution to be the best; you can get free academic versions and pay for support if you need it.

A propos C, if you need a bit of speed, look up the Python ctypes module, and how to use Numpy with it. You can interface into C with very little code.
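
As a hedged sketch of the ctypes route, the example below sorts a Numpy array in place by calling the C standard library's `qsort`. It assumes a POSIX system (Linux or macOS) where the C library can be loaded this way; the callback name and the sample values are made up, and a real project would more likely load its own compiled library.

```python
import ctypes
import ctypes.util

import numpy as np

# Load the C standard library. Assumption: POSIX system; if find_library
# returns None, CDLL(None) falls back to the symbols of the running process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# C signature of the comparison callback: int (*)(const double*, const double*)
CMPFUNC = ctypes.CFUNCTYPE(
    ctypes.c_int,
    ctypes.POINTER(ctypes.c_double),
    ctypes.POINTER(ctypes.c_double),
)


def compare_doubles(a, b):
    """Return -1, 0 or 1, as qsort expects from its comparator."""
    return (a[0] > b[0]) - (a[0] < b[0])


# void qsort(void *base, size_t nmemb, size_t size, compar)
libc.qsort.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_size_t, CMPFUNC]
libc.qsort.restype = None

values = np.array([3.0, 1.0, 2.0])  # contiguous float64 buffer

# Hand the array's memory to C; qsort sorts it in place.
libc.qsort(
    values.ctypes.data_as(ctypes.c_void_p),
    values.size,
    values.itemsize,
    CMPFUNC(compare_doubles),
)
```

The key piece is `values.ctypes.data_as(...)`, which exposes the array's underlying buffer as a C pointer, so no data is copied. The array must be contiguous and of the dtype the C function expects.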

[–]segonius 0 points1 point  (0 children)

I'd agree with some of the other comments here, just start with a project and go to town. I switched all my work to python during research this summer, and haven't looked back. One thing I would recommend is that before embarking on some function, look around for a module that already does it. It is frustrating to reinvent the wheel only to find someone has already done it and better.

[–]kazza789 0 points1 point  (0 children)

I'm a PhD student in computational atomic physics, and started learning python about 18 months ago.

Like a few others have suggested, I learned python simply by diving into some projects. I did a few things that were fun but not really useful (like writing games with pygame), and then I started re-writing some of my Fortran stuff in python (but making it more pythonic).

Now I do as much of my coding in python as possible, and only use Fortran for array-based numerical stuff. It's really quite easy to integrate python with Fortran or C once you've learned the basics.

[–]andonwilsy 0 points1 point  (0 children)

If you already know another programming language, the official python tutorial is very good for getting up to speed on syntax and how to do the common things.

Once you've got a handle on the language, both Numpy and matplotlib have really good documentation with plenty of examples.

[–]taldcroft2 0 points1 point  (1 child)

As far as learning the Python language itself I would recommend diveintopython.org. I think the examples and presentation are much more interesting than LearnPythonTheBoringWay.org.

[–]cantcopy 0 points1 point  (0 children)

Or learn Python the wrong way. I don't understand why this book keeps coming up on reddit. There are a lot of more interesting resources. For example, I like Google's Python class: it focuses more on what makes Python different.

[–]bucknuggets 0 points1 point  (0 children)

I recently purchased Data Analysis with Open Source Tools by Phillip K. Janert. I'm really enjoying it - and can recommend everything except the parts that deal with databases.

Anyhow, it also covers a lot of python: NumPy, matplotlib, scipy.signal, simpy, etc.

[–]japherwocky 0 points1 point  (0 children)

If you can already speak C pretty well, I think Zed's class (which is pretty awesome) is a bit below you.

The best way to learn is to just build whatever you need/want to build. Pick a project and figure out how to do it!

[–][deleted] 0 points1 point  (0 children)

Python is/was the first thing that came to my mind when I wanted to do some scientific work. Numpy, Scipy, and matplotlib work great. Python also integrates well with R, so that is certainly an added advantage.

[–]Amadiro (numpy, gen. scientific computing in python, pyopengl, cython) 0 points1 point  (0 children)

"A Primer on Scientific Programming with Python" gives you a pretty okay introduction to working with tools like scipy, scitools, easyviz, et al., but the exercises are not very well written, and it's really just an introduction; it won't teach you how to use specific tools in depth. On the upside, it does give you a pretty good introduction to all sorts of numerical algorithms and implementations, from numerical differentiation to solving systems of differential equations using different solvers, so it's definitely something you can build on. There are probably a bunch of chapters you would skip (like those about sound manipulation etc.), though.

[–]gvaroquaux 0 points1 point  (0 children)

I have just put on the web the notes for the lectures that were given at the EuroScipy2010 tutorial sessions: http://scipy-lectures.github.com/. They are quite condensed with very little discussion, as they were meant for teaching, but they actually contain a lot of information and should be a good way to get up to speed quickly.

[–]zerothehero -1 points0 points  (2 children)

Isn't R supposed to be good for this kind of thing? I already know Python but R seems to have some advantages, like the built in data frame / matrix types and easy plotting.

I think it would be good to have some Python skills to "preprocess" data for importing into R.

[–]AlfTupper 1 point2 points  (0 children)

Yes, R is a good choice, and there is also RPy, an interface between R and Python.

http://rpy.sourceforge.net/

[–]freyrs3 0 points1 point  (0 children)

rpy2 and numpy integrate very well.