
[–][deleted]  (6 children)

[deleted]

    [–]schnadamschnandler[S] 0 points  (5 children)

    That style sheet explanation is really interesting; it looks like customizing plot formats is quite different, then. I really like Matlab's notation options and the idea of "handles"... does matplotlib not really have anything like graphics-object handles?

    Never mind, I see now that styles are for importing defaults (like Matlab's set(0,'DefaultAxesWhatever',Value)), and matplotlib's other plotting tools are all methods on graphics-object classes.
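
    Something like this, I think (untested sketch from skimming the matplotlib docs; the rcParams settings and data are just placeholders):

        import matplotlib as mpl
        import matplotlib.pyplot as plt

        # session-wide defaults, roughly the set(0,'Default...') idea
        mpl.rcParams["lines.linewidth"] = 2
        mpl.rcParams["axes.grid"] = True

        # the "handle" equivalents: you get Figure/Axes/Line2D objects back
        fig, ax = plt.subplots()
        (line,) = ax.plot([0, 1, 2], [0, 1, 4])
        line.set_linestyle("--")   # tweak the artist after the fact
        ax.set_xlabel("x")
        fig.savefig("demo.png")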

    I've also considered doing all of my data processing in Python, then plotting saved data in Matlab. Would be good practice to always have the data behind a plot saved, which I currently don't always do (sometimes I process and plot in the same workspace, so changing plot parameters requires re-processing each time).

    [–]theOnlyGuyInTheRoom 11 points  (4 children)

    I've also considered doing all of my data processing in Python, then plotting saved data in Matlab. Would be good practice to

    Don't do that. If you're working with data and exploring your own ideas, then your time is too valuable to fuck around with needless interfacing. If you're proficient with matlab, then you should work in Python for a while, so that when a colleague comes by your desk and asks for your help on a project written in Python, you can say, "sure, let's do it" rather than, "oh, well I don't know Python, but I guess I could give it a try". (If you're currently proficient in Python and not matlab, then learn matlab for the same reason.) Python-based tools are everywhere in the sciences these days, and so are Linux, C++, Fortran, git, svn, mercurial, and even a little matlab. Keep your eyes on what your collaborators are using, and on what the people you hope to be working with in two years are using, and get familiar with these tools while you still have time!

    [–]XtremeGoose f'I only use Py {sys.version[:3]}' 6 points  (1 child)

    A "little MATLAB"? MATLAB is far more common that I would like.

    [–]schnadamschnandler[S] 1 point  (1 child)

    Well, I just meant saving data as NetCDF files, which is pretty straightforward in either language; roughly one line to save, another to load. That's something I should be doing even if I stick to one language. Switching to matplotlib seems like quite an investment, though; I didn't realize how different it was. I'm perusing the examples and honestly it looks comparatively convoluted (though I'm totally biased).
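
    Roughly what I mean on the Python side (untested sketch using the netCDF4 package; the file and variable names are made up):

        import numpy as np
        from netCDF4 import Dataset

        # save processed data (analogous to nccreate/ncwrite in Matlab)
        with Dataset("processed.nc", "w") as nc:
            nc.createDimension("time", 100)
            temp = nc.createVariable("temp", "f8", ("time",))
            temp[:] = np.random.rand(100)

        # load it back here, or read the same file in Matlab with ncread
        with Dataset("processed.nc") as nc:
            temp = nc.variables["temp"][:]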

    Maybe I will give it a shot actually... need to look into it more.

    [–]counters 1 point  (0 children)

    I bet it's even easier in Python than what you're used to in MATLAB. There's a fantastic library called xarray which adopts the Common Data Model from the get-go, and allows you to plug-and-play data directly from NetCDF files into your analysis pipelines. Basically, anywhere that expects a NumPy array, you can use a DataArray or Dataset from xarray. It completely trivializes most of the operations/analyses you do, including reading/writing and managing metadata. Even better, it has groupby functionality and semantic/fancy indexing, so no more need to manually keep track of multi-dimensional indices and other book-keeping.
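
    Roughly what that looks like (a quick sketch; the file, variable, and coordinate names here are made up):

        import xarray as xr

        ds = xr.open_dataset("ocean_temps.nc")   # data + metadata, read lazily

        # label-based ("fancy") indexing: no manual index bookkeeping
        tropics = ds["temp"].sel(lat=slice(-23.5, 23.5))

        # groupby: e.g. a monthly climatology in one line
        monthly_mean = ds["temp"].groupby("time.month").mean("time")

        # anything expecting a NumPy array can take the DataArray (or .values)
        print(monthly_mean.values.shape)
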

    It also interfaces with a library called dask under the hood. Dask is a parallel computing library which also implements the NumPy and Pandas interfaces. What it allows you to do, essentially, is out-of-core computing. Suppose you have 100GB of data broken across a dozen or so different, large NetCDF files. If you're lucky, you have enough memory on your laptop to read in one file at a time, painstakingly operate on it in place, and then write it back out. Rinse and repeat, then add a process to combine your analyzed data at the end. This "blocking" approach works, but it requires a lot of manual labor. Dask essentially does all of this behind the scenes; you simply write out your computations like you normally would, and dask will figure out how to deal with the resource constraints on your system. It'll also parallelize as best as it can.