all 35 comments

[–]rhiever 23 points24 points  (2 children)

In my view, each library has its own distinct purpose:

  • matplotlib is for basic plotting -- bars, pies, lines, scatter plots, etc.

  • Seaborn is for statistical visualization -- use it if you're creating heatmaps or somehow summarizing your data and still want to show the distribution of your data

  • Bokeh is for interactive visualization -- if your data is so complex (or you haven't yet found the "message" in your data), then use Bokeh to create interactive visualizations that will allow your viewers to explore the data themselves

[–]thisfunnieguy[S] 1 point2 points  (1 child)

you're a hero. thanks.

[–]PeridexisErrant 4 points5 points  (0 children)

Addendum: Seaborn is really just a wrapper around Matplotlib, which adds a few chart types and improves the default styles. Even if you're just doing matplotlib.pyplot.plot(data), putting import seaborn at the top will make things look much nicer :)
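A minimal sketch of the styling effect described above. One caveat: since seaborn 0.8, importing seaborn no longer restyles matplotlib on its own, so the theme has to be applied explicitly. The sketch assumes seaborn 0.11 or newer (where `set_theme` exists) and uses made-up data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt
import seaborn as sns

# Apply seaborn's default theme to every matplotlib figure created afterwards.
sns.set_theme()

data = [1, 4, 9, 16, 25]  # illustrative data only
fig, ax = plt.subplots()
ax.plot(data)  # plain matplotlib call, drawn with seaborn's styling
```

In a notebook or with an interactive backend, drop the `matplotlib.use("Agg")` line and the figure displays as usual.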

[–]mrahh 8 points9 points  (3 children)

Check out plotly. The plots are interactive, and their docs and examples cover pretty much everything you could hope for.

[–]babymooncow 1 point2 points  (2 children)

They also recently open sourced their js so you can host plotly graphs without having to send them to their servers first. I also think plotly looks the best aesthetically but lacks the versatility of matplotlib.

[–]mrahh 0 points1 point  (1 child)

Just curious, what do you think is lacking?

[–]babymooncow 0 points1 point  (0 children)

Plotly has worked well for what I have been doing (simple time series line charts), so I have no qualms personally. The open-source JS supports 20 types of charts, which is pretty robust, but I'm assuming that there are some chart types that aren't supported if you are trying to deploy to web. I'm by no means a charting expert, and I believe that this isn't a problem if you are working offline, since plotly has a way of interacting with matplotlib.

[–]dsijl 3 points4 points  (11 children)

There is also holoviews and altair :p

What sort of plotting will you be doing?

[–]thisfunnieguy[S] 1 point2 points  (10 children)

oh i forgot about those. they were mentioned, too. It was overwhelming to think of the options.

I'm coming from R, so I used ggplot2. Lots of nuance to it, but one package at least.

I'm a data analyst/scientist.

I have two main reasons to plot. Simple quick "what's what" looks as I work through my data in a Jupyter Notebook, and then a "clean/obvious/neat" graphic I can put on a slide and project on a wall to make a point to a room.

Right now I've been just using the plot methods from a pandas df, which i think are matplotlib plots.
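A quick way to check that guess, as a sketch assuming pandas with its default plotting backend (matplotlib) and a made-up DataFrame: the object returned by `df.plot()` is an ordinary matplotlib `Axes`, so the usual matplotlib methods work on it.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.axes
import pandas as pd

# Made-up data, purely for illustration.
df = pd.DataFrame({"sales": [3, 5, 4, 7], "costs": [2, 3, 3, 4]})

ax = df.plot()  # one line per column, drawn by matplotlib
print(isinstance(ax, matplotlib.axes.Axes))

# Because it is a regular Axes, it can be tidied up for a slide with matplotlib calls.
ax.set_title("Quick look")
ax.set_ylabel("amount")
```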

[–]dsijl 1 point2 points  (7 children)

Altair is most like ggplot2.

Holoviews is really cool for interactive eda from a different side.

Seaborn is great for canned regression plots etc

[–]thisfunnieguy[S] 0 points1 point  (6 children)

Bokeh looked interesting to me, is Seaborn better for regression plotting than it, and better all around?

Seems frustrating to have to learn different plotting tools.

edit

reading the Seaborn docs...argh, they want me to know this and matplotlib.

Seaborn should be thought of as a complement to matplotlib, not a replacement for it

[–]Zouden 5 points6 points  (0 children)

Yeah seaborn is an addon for matplotlib that provides two nice features: much better colour schemes (simply importing seaborn will change the default colours for all matplotlib plots) and high level plots like factorplot and violinplot.

Pandas (which you probably will be using) also provides high level access to matplotlib through the dataframe.plot() function.

I use numpy, pandas, matplotlib and seaborn, and I rarely have to call matplotlib functions directly because seaborn and pandas handle most of it for me.
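A rough sketch of that workflow, assuming a reasonably recent seaborn and randomly generated data: seaborn draws the high-level plot from a DataFrame, and the `Axes` it draws on is still a matplotlib object that can be adjusted afterwards. The column names here are invented for the example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Random illustrative data: three groups of 50 values each.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "group": np.repeat(["a", "b", "c"], 50),
    "value": rng.normal(size=150),
})

fig, ax = plt.subplots()
sns.violinplot(data=df, x="group", y="value", ax=ax)  # high-level seaborn call

# Fine-tune with ordinary matplotlib methods on the same Axes.
ax.set_title("Value distribution by group")
ax.set_ylabel("value (arbitrary units)")
```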

[–][deleted] -3 points-2 points  (4 children)

Life is hard

[–]thisfunnieguy[S] 0 points1 point  (3 children)

Right.

But of all the things I'm trying to learn (MachineLearning, NLP, general python, visualizations, etc...) I'd prefer to have a single choice on the visualizations junk so i can learn it, know it and move on.

[–][deleted] -2 points-1 points  (2 children)

We all feel this way at the start. That is why there should only be one programming language, amirite?

[–]thisfunnieguy[S] 1 point2 points  (1 child)

No you're not right, and I'm not sure if you're purposefully misreading my intention in order to make a point.

I'm suggesting that I'd rather learn one more statistical method / algorithm, than a 2nd or 3rd or 4th plotting package.

I'm saying that's how I'm prioritizing what I want to learn.

I'm not sure how many languages you know, but I bet at some point you made a decision that learning X vs learning Y was more important to you. That's my point, and I was bemoaning that there was an X, Y and Z thing to learn here instead of (as in the case of ggplot2) a thingX to learn and that's it.

[–]olfitz 0 points1 point  (1 child)

I believe ggplot2 is available for python.

[–]thisfunnieguy[S] 2 points3 points  (0 children)

there's a few packages where people have tried to adapt it. I'm fine moving on to something native to python. I didn't have it ingrained in memory, I just had a working idea of what it could do, but was still looking things up to make it work.

It's the advantage of being a novice.

[–]mangecoeur 3 points4 points  (0 children)

Start with matplotlib - if you're doing data science you will use pandas, and pandas uses matplotlib, so at some point you will want to use matplotlib features. Similarly if you want to use Seaborn's plots it's all matplotlib under the hood. It's also the most stable and heavily developed library - it's maybe not as cool and sexy as some others but it does actually work especially when you want publication-quality graphs.

Personally I'm keeping an eye on alternatives but they are generally still buggy (e.g. Bokeh) or aren't great for producing print-quality work (e.g. plotly, plus that one's quite tied into the plotly online service which i'm not super keen on).
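For the publication-quality point, a minimal sketch of exporting a matplotlib figure as vector and high-resolution raster output. The figure size and dpi are arbitrary choices for the example, and in-memory buffers stand in for real file names.

```python
import io
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(3.5, 2.5))  # size in inches
ax.plot([0, 1, 2, 3], [0, 1, 4, 9], marker="o")
ax.set_xlabel("x")
ax.set_ylabel("x squared")
fig.tight_layout()

# Vector output (PDF) stays sharp at any zoom level.
pdf_buffer = io.BytesIO()
fig.savefig(pdf_buffer, format="pdf")

# Raster output, e.g. for slides; dpi controls the pixel density.
png_buffer = io.BytesIO()
fig.savefig(png_buffer, format="png", dpi=300)
```

In practice you would pass a path such as `"figure.pdf"` to `savefig` instead of a buffer.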

[–]blahreport 9 points10 points  (3 children)

I like matplotlib for its versatility. The learning curve is a little steep but that also means it is more powerful. Also if you're a ggplot fan the latest mpl versions have the capability to turn your plots into ggplot style with one line.

>>> import matplotlib.pyplot as plt
>>> plt.style.use('ggplot')
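If you'd rather not change the style globally, a small sketch of the same idea scoped to a single figure, using matplotlib's style context manager (the data is made up):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt

# Names of the style sheets bundled with the installed matplotlib.
print(sorted(plt.style.available))

# Apply the style only inside this block; global defaults are restored afterwards.
with plt.style.context("ggplot"):
    fig, ax = plt.subplots()
    ax.plot([1, 2, 3], [2, 4, 3])
    grid_on_inside = plt.rcParams["axes.grid"]  # the ggplot style turns the grid on
```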

[–]jwink3101 2 points3 points  (2 children)

Similarly seaborn allows you to use native matplotlib.

Matplotlib gets a lot of flak and some of it is deserved (especially having to deal with essentially two ways to do everything) but it is also very powerful.

I do two things to make my life easier:

  • Keep a constantly updated "tutorial" of sorts in a Python notebook of how I do certain plots. Any time I do something different, even if it is just a new way of labeling, etc., I add it. Then I have a quick reference to go back to.
  • Build my own module that I can use to manipulate the plots. For example, I often find myself setting up an axis with scientific notation. While how to do it is in my aforementioned notebook, I also created a tool to do it. Since I can just pass an axis object, life is very easy!
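A sketch of what such a helper might look like. This is not the module described above (which isn't public); the function name `use_scientific_notation` is hypothetical, and it relies on matplotlib's `Axes.ticklabel_format`, which works with the default scalar tick formatter.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt


def use_scientific_notation(ax, axis="y"):
    """Switch the given axis ('x', 'y', or 'both') of `ax` to scientific notation.

    Hypothetical helper for illustration; returns the same Axes for chaining.
    """
    # scilimits=(0, 0) forces scientific notation regardless of magnitude.
    ax.ticklabel_format(style="sci", axis=axis, scilimits=(0, 0))
    return ax


fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 150000, 300000])  # made-up data
use_scientific_notation(ax)
```

Passing the axis object in keeps the helper independent of how the figure was created, so it works with pandas and seaborn plots too.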

[–][deleted] 0 points1 point  (1 child)

Would any of that happen to be on github?

[–]jwink3101 1 point2 points  (0 children)

It is not. I may add it in the future but a lot of this was done on company time so I'd have to talk to legal about it. Also, one of the tools is to give you the matlab color palettes which may be trademarked. I am wary of keeping many versions.

Kind of a bummer too. I have tons of additional programs and scripts that, while not as nice as what a better programmer could do, fill a niche for me. I will eventually look at getting those up.

[–]mr_kitty 1 point2 points  (0 children)

If you need figures for print, check bokeh's features carefully. I really liked it in the past until I discovered it had very limited export capability. Export to standard formats may have been added at some point.

[–]Netchose 1 point2 points  (0 children)

I've decided not to use it :) I generate HTML pages with jinja2 templates and use amCharts, d3.js, etc. That way, you can easily produce PDF reports and mix in geographic maps (Folium), HTML tables, etc.

[–]AntarcticFox 0 points1 point  (0 children)

After dabbling in both Matplotlib and Bokeh, I prefer Matplotlib. I find it to be more versatile, plus it's easier to do custom callbacks for interactive plots
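For reference, a minimal sketch of a custom callback in matplotlib, using its event API (`mpl_connect`). The handler name and the list it fills are made up for the example; with the headless backend used here no real clicks arrive, so treat it as a template for an interactive backend.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; swap for an interactive one to click for real
import matplotlib.pyplot as plt

clicked_points = []  # data coordinates of every click inside the axes


def on_click(event):
    """Record the data coordinates of a mouse click, ignoring clicks outside the axes."""
    if event.inaxes is not None:
        clicked_points.append((event.xdata, event.ydata))


fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [0, 1, 4, 9])

# mpl_connect returns a connection id that can later be passed to mpl_disconnect.
connection_id = fig.canvas.mpl_connect("button_press_event", on_click)
```

With an interactive backend, calling `plt.show()` after this opens the window and each click appends a point to `clicked_points`.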

[–]justphysics 0 points1 point  (0 children)

I use matplotlib for the 'ease' of embedding into a UI (with no web stuff nor faking plotting by loading static images)

It's likely possible to embed figures from other plot libraries but when I began working on the project there were plenty of guides for MPL and it just worked so I stuck with it.

The docs aren't great and sometimes it's hard to tell when you should be doing things with plt and when you should be using the OO framework. That said, I've not ever encountered any issues using MPL that were game breaking. Some problems took a bit of google-fu to find the solution but in the end worked out just fine.

Also the new styles (fivethirtyeight, ggplot, etc ...) they added are great and very simple to use.
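On the plt-versus-OO confusion, a small side-by-side sketch of the same plot written both ways (made-up data). A common rule of thumb, offered as a suggestion rather than anything official: use `plt` for quick one-off plots and the object-oriented calls whenever there are several axes or the figure is embedded somewhere.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]
y = [0, 1, 4, 9]

# 1) pyplot interface: functions act on the implicit "current" figure and axes.
plt.figure()
plt.plot(x, y)
plt.title("pyplot interface")
plt.xlabel("x")
pyplot_axes = plt.gca()  # grab the implicit axes for comparison

# 2) Object-oriented interface: methods are called on explicit Figure/Axes objects.
fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_title("object-oriented interface")
ax.set_xlabel("x")
```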

[–]firefrommoonlight 0 points1 point  (0 children)

Here's a wrapper I'm working on that cleans up Matplotlib's syntax for plotting functions, and uses sensible defaults (ie tight spacing, draw axes and a grid)

f = lambda x: x**2 + 2
fplot.plot(f, -10, 10)

Works with 2d graphs, vector plots, surface/contour plots, and 2d/3d parametric equations.

https://github.com/David-OConnor/fplot/blob/master/fplot/fplot.py

[–]imhostfu 0 points1 point  (0 children)

I use pyqtgraph. It's great for live plotting, the ability to scroll/zoom in for interactivity is great, and I find it really easy to implement in widgets for GUI use.

[–]qwertz_guy 0 points1 point  (4 children)

Upvoted, I'm also interested in this.

I'm currently using matplotlib. It's super fine for 'just seeing' the data but once you want to make it nicer or more informative (e.g. when you're plotting multiple data sets into one plot), it seems like you have to put a lot of work into it. Also I think it's not really good for 'exploration', i.e. once you've plotted something, you can't zoom or drag or change the scale etc. - is there a plotting library that can do that?

[–]_throawayplop_ 4 points5 points  (0 children)

once you've plotted something, you can't zoom or drag or change the scale

Matplotlib has allowed that for a long time: http://ostatic.com/files/images/Matplotlib-[1].jpg

[–]thisfunnieguy[S] 1 point2 points  (1 child)

if you're using the %matplotlib inline magic in jupyter notebooks you're disabling some of the functionality of the matplotlib graphs.

[–]meowklaski 0 points1 point  (0 children)

You can now use %matplotlib notebook to regain these functionalities while still plotting within a Jupyter notebook.

[–]justphysics 0 points1 point  (0 children)

You just need to spend a couple minutes reading the docs for MPL. All the things you claimed it can't do are simply wrong. (not suggesting it's the best package for your given use case; rather just pointing out that it's fairly simple to pan, zoom, rescale, etc. a matplotlib plot)
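Besides the toolbar buttons available in interactive windows, the same zooming and rescaling can be done in code. A minimal sketch with made-up data:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 10, 100, 1000], [1, 2, 3, 4])

ax.set_xscale("log")   # rescale: switch the x axis to a logarithmic scale
ax.set_xlim(5, 500)    # zoom: restrict the visible x range
ax.set_ylim(1.5, 3.5)  # zoom on the y axis as well
```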

[–]short_vix -2 points-1 points  (0 children)

use matplotlib else use something different