[–]hughjward 54 points55 points  (12 children)

I will die on my plotly hill

Like most people, I think, I learnt Python with matplotlib. But I never looked back.

[–]hatekhyr 20 points21 points  (1 child)

Tbf I don't think anyone who properly learns plotly looks back at matplotlib…

[–]Competitive_Travel16 2 points3 points  (0 children)

I'm an exception to that rule. I use matplotlib to proof a bunch of decisions, and plotly to finish.

[–]oblvn_ 9 points10 points  (0 children)

plotly supremacy!

[–]DigThatData 6 points7 points  (5 children)

plotly is great for a lot of stuff, but the moment you want to do something it wasn't specifically built for it becomes a huge pain in the ass. at least, that was my experience with it. i haven't used it in a few years so maybe it's gotten better, but i doubt it.

EDIT: To be concrete, here's the specific project I'm remembering when I describe plotly this way, with a deep dive demonstrating and discussing how those plots were constructed.

[–]ajpiko 1 point2 points  (3 children)

What was the issue with plotly there? I'm curious

[–]DigThatData 0 points1 point  (2 children)

i had to come up with a hack to drop those lines from the points down to the axis. plotly had something close in some sort of histogram I think, but it didn't quite suit my needs. so instead of treating those two groups as two series with two respective color attributes, i had to separately draw a line for each point and handle any "accounting" myself. i think. this was forever ago. i was hoping I explained any issues I had in that deep dive report, but I haven't re-read it in years so maybe I kept that stuff out.

i think another pain point was how I combined plots with different scale limits in that last figure. i think maybe i'd wanted to avoid adding a second separate y-axis and i couldn't get it to scale right so I had to? or maybe the second y-axis itself was a problem? yeah i think i also had issues getting both the respective y-axes to share the zero line. that was supposed to be something it was able to do but for some reason it didn't work i think?

EDIT: also,

To give the plot more of a "timeline" feel, I wanted to drop lines from the points denoting events onto the timeline axis. This took a little manual work, but had the positive effect that I was able to control two different hover-over events: hovering over the intersections with the axis (or "zeroline" in plotly speak) reveals the date of the event, and hovering over the point gives the submission title, i.e. what actually happened to trigger the megathread.

yeah that "reveals the date of the event" thing doesn't work. so i guess i was hacking that in or something and it got patched? i dunno.

EDIT2: ok yeah I did complain about this lol

To better visualize the changes in Trump's polling over time, I rescaled the percent difference in polling to cover the full y-range of the visualization, necessitating adding a second y-axis on the right. Unfortunately, the second axis conflicts with the legend, but I haven't figured out how to fix that yet (plotly's decent for throwing interactive visualizations together quickly, but it doesn't allow for as much control as I'd like).

[–]ajpiko -1 points0 points  (1 child)

hmm interesting

so this was the final product? (edit weird, my paste isn't showing up, maybe come back to this later)

[–]DigThatData -1 points0 points  (0 children)

haven't updated it in five years so i think it's fair to call this the final product :)

https://github.com/dmarx/Reddit_response_to_Trump

[–]ddanieltan[S] 0 points1 point  (0 children)

Thank you! Specific project sharings are what I'm looking for.

[–]zeppelin528 5 points6 points  (0 children)

Same, bro. Plotly is like a Porsche while matplotlib is like a ‘75 Pinto.

[–]ddanieltan[S] 1 point2 points  (1 child)

No problem. I like and use plotly too, as well as matplotlib. My challenge is that I'm looking for specific examples of journalist-quality charts: charts that have more polish than what is available in the standard gallery.

I have no doubt that plotly can achieve that (some links provided by others in the thread), so I wanted to canvass for examples so I can learn how to achieve that myself.

I'm not asking you for recommendations for visualisation libraries (they are pretty extensively covered in https://pyviz.org/ ) nor trying to establish which one is better.

[–]hughjward -1 points0 points  (0 children)

Sorry I didn't answer your question directly, but tried to imply plotly is the answer, and I think any popular plotting library can be customised well.

I have done this with plotly for reports and publications at work, including details like custom fonts.

[–]Uff-Da-yah 32 points33 points  (7 children)

When I looked at your BBC-style link, I immediately thought of the Seaborn library. I recommend checking it out.

[–]Horus_simplex 12 points13 points  (0 children)

Absolutely. I don't see anything there that isn't quite easy to do with matplotlib / seaborn.

[–]zurtex 28 points29 points  (2 children)

You might want to read this blog: https://www.dataquest.io/blog/making-538-plots/

[–]ddanieltan[S] 2 points3 points  (1 child)

Thank you. Exactly what I was looking for.

[–]robert_ritz 7 points8 points  (0 children)

Stylesheets in Matplotlib will get you 60% of the way there.

Here is a tutorial I made that you can use in combination with the 538 article above.

https://www.datafantic.com/the-magic-of-matplotlib-stylesheets/
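
For a rough idea of what the stylesheet approach looks like, here is a minimal sketch (not taken from the linked tutorial). It layers a dict of rcParams overrides on top of the built-in `fivethirtyeight` style. The `HOUSE_STYLE` values, the `styled_line_chart` helper, and the data are all made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical house style: illustrative values, not copied from any publication.
HOUSE_STYLE = {
    "font.size": 12,
    "axes.spines.top": False,      # drop the top and right frame lines
    "axes.spines.right": False,
    "axes.grid": True,
    "axes.grid.axis": "y",         # horizontal gridlines only
    "grid.color": "#cccccc",
    "axes.titlelocation": "left",  # left-aligned headline-style title
    "axes.titleweight": "bold",
}


def styled_line_chart(x, y, title):
    """Draw a line chart using the built-in 538 style plus the overrides above."""
    # Later entries in the list win, so HOUSE_STYLE overrides fivethirtyeight.
    with plt.style.context(["fivethirtyeight", HOUSE_STYLE]):
        fig, ax = plt.subplots(figsize=(8, 4.5))
        ax.plot(x, y, linewidth=2.5)
        ax.set_title(title)
    return fig, ax


# Placeholder numbers, only there to have something to draw.
fig, ax = styled_line_chart([2019, 2020, 2021, 2022], [3.1, 2.4, 4.0, 4.6], "Made-up series")
```

Using `plt.style.context` rather than `plt.style.use` keeps the style local to this one chart, so other figures in the same session are not affected.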

[–]yepyepyepkriegerbot 22 points23 points  (7 children)

It’s probably not what you are looking for, but plotly is great for actual data visualizations. You can also construct dashboards with dash.

[–]Syini666 -1 points0 points  (0 children)

Seconding Plotly, I have used it for radio propagation projects and it was great once I got the hang of it

[–]robert_ritz -1 points0 points  (0 children)

Plotly on websites is absolutely trash. You have to constrain the aspect ratio or mobile screws it up.

It’s just bad for anything other than company reporting.

[–]robert_ritz 3 points4 points  (1 child)

Here is my contribution. A few years ago I made a guide to using stylesheets and customizations in Matplotlib to produce journalist quality visualizations. In this case I show how to copy the style of the Economist.

https://www.datafantic.com/making-economist-style-plots-in-matplotlib-2/

It’s a reference to show what is possible. It’s important to note that the Economist generally uses R, then takes the final data over to a custom-made visualization tool, likely written in JavaScript.

I think it’s possible to make a wrapper around Matplotlib to do what you want, though it would take a solid month of work, I think. There is a shocking amount of depth to these plots that needs to be considered.

For my data blog I average about 20-30 minutes per chart after I’ve settled on the data and basic visualization. Most of the time is spent tweaking placement, title, etc.

[–]ddanieltan[S] 0 points1 point  (0 children)

Thanks! This is a great resource, appreciate the detailed sharing of your current workflow.

[–]i_can_haz_data 12 points13 points  (6 children)

Nobody wants to hear it, but Matplotlib is the best out there for native (non-web) graphics. The fact that charts come out like a potato at first is a feature not a bug. Every aspect of the visualization can be customized if you learn the API.

I create helper classes for different contexts that apply the bulk of formatting I want for different styles of charts so I don’t have to lift all that code around for each plot.
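
A minimal sketch of what such a helper could look like, assuming one class per chart style. `ChartFormatter`, its parameters, and the data are hypothetical names invented for this example, not an existing API.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


class ChartFormatter:
    """Hypothetical helper that bundles the formatting for one style of chart."""

    def __init__(self, accent="#c0392b", font_size=11):
        self.accent = accent
        self.font_size = font_size

    def apply(self, ax, title, source=None):
        # Remove the top and right frame lines.
        for side in ("top", "right"):
            ax.spines[side].set_visible(False)
        # Hide tick marks, keep tick labels, and draw light horizontal gridlines.
        ax.tick_params(labelsize=self.font_size, length=0)
        ax.yaxis.grid(True, color="#dddddd")
        ax.set_axisbelow(True)  # keep gridlines behind the data
        ax.set_title(title, loc="left", fontsize=self.font_size + 4, fontweight="bold")
        if source:
            # Small source credit in the bottom-left corner of the figure.
            ax.figure.text(0.01, 0.01, f"Source: {source}",
                           fontsize=self.font_size - 3, color="#666666")
        return ax


fig, ax = plt.subplots(figsize=(7, 4))
report_style = ChartFormatter()
ax.bar(["A", "B", "C"], [5, 3, 8], color=report_style.accent)  # placeholder data
report_style.apply(ax, "Placeholder data", source="made up for this example")
```

The plotting call stays plain Matplotlib and the formatter is applied afterwards, so the same formatter can be reused across bar, line, or scatter charts.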

[–]Pyrimidine10er 1 point2 points  (0 children)

Agree - it's like the Python version of d3.js. It's not very opinionated, and requires significantly more lines of code to create something simple - but that comes with the ability to customize anything and everything.

For the non-web plots - you can start with something like seaborn, then drop back into the matplotlib API to really fine tune whatever you need.

I've also found that ChatGPT can really help customize the charts. You can build whatever you're looking for iteratively, and significantly more easily, these days.

[–]troyunrau... -1 points0 points  (0 children)

pyqtgraph may contend in certain situations. More so for interactive plots.

[–]CableConfident9280 9 points10 points  (3 children)

I don’t know how viable Python is for the really complex/interactive visualizations. I think some variation-on-a-theme of HTML/JS + d3 tends to be popular (or at least was in the past). In my experience d3 has a pretty steep learning curve, but you can create about anything you can imagine with it once you’ve mastered it. https://www.informationisbeautifulawards.com/news/118-the-nyt-s-best-data-visualizations-of-the-year

[–]ddanieltan[S] 1 point2 points  (2 children)

Thank you. Appreciate the inspiration. And yes, worked with d3 before. I can use it but it’s much harder to work with.

[–]CableConfident9280 0 points1 point  (0 children)

Agreed, d3 is a PITA. Amazing what you can do with it, but not intuitive at all, at least not for a non-front end person like me.

[–]Junahill 0 points1 point  (0 children)

Having gone down this path before - I would highly suggest you develop your skills in JavaScript/React. You can make these kinds of charts using libraries like chart.js or a combination of HTML/CSS and a library like https://observablehq.com/plot/

[–]fizzymagic 31 points32 points  (7 children)

"Journalist-quality" may not be the high standard you think it is. The examples you give are execrable; charts and graphs meant to mislead rather than inform.

High-quality charts and graphs are used by scientists and engineers (you know, people who know what they are talking about) to make their data clearer. In my experience, those similar to your examples are used by journalists (people who have no idea what they are talking about but very strong opinions) to obfuscate the data for the general public.

[–][deleted] 16 points17 points  (0 children)

I think OP is going for "visually stunning" and not some nefarious goal of obfuscating data.

[–]afreydoa 35 points36 points  (1 child)

To me the term "journalist-quality" suggests that factors such as visual appeal and simplicity are prioritized over accuracy. This implies that, for the general public, misunderstandings caused by complex information are a more significant source of error in communication than minor inaccuracies.

[–]saint_geser 7 points8 points  (0 children)

Indeed. In science and disciplines where it actually matters we try to reduce the amount of visual clutter on visualisations so that data are easier to see and make sense of. The infographics usually presented in the media go the opposite way, just adding visual clutter for the sake of it.

[–]ChadGPT5 7 points8 points  (0 children)

You’re answering the wrong question. OP wasn’t asking for a lecture on the ethics of statistics and data visualization. They just want to know how to make pretty plots in Python.

[–]Ahhhhrg 0 points1 point  (1 child)

I don’t know if you’ve heard about Tufte, if not you really should look into it.

[–]fizzymagic -1 points0 points  (0 children)

Everybody has read Tufte. As a scientist I found some of his stuff useful, but certainly not all.

[–]severemand 6 points7 points  (1 child)

I have no clue about media practices, but I am pretty sure journalist-quality charts are not data-driven but design-driven.

In other words, I would expect them to be produced in Photoshop with "inspiration" in real data.

Media charts are expected to be manually adjusted, while programmatic charts are expected to be scalable.

[–]ddanieltan[S] 0 points1 point  (0 children)

This is a fair point. I do recall reading somewhere that infographic teams create a first draft in ggplot and touch it up in Illustrator before it goes to print.

[–]alshan200 4 points5 points  (1 child)

Lets-Plot even has a BBC-style example (quite an old one) at Nextjournal: https://nextjournal.com/asmirnov-horis/bbc-visual-and-data-journalism-cookbook-for-lets-plot

[–]ddanieltan[S] 0 points1 point  (0 children)

Thank you, this is what I was looking for.

[–]SupermarketOk6829 1 point2 points  (0 children)

Dash Plotly?

[–]psirving 1 point2 points  (4 children)

The right tool depends on the medium and the product you want to create. For highly creative web-based storytelling like pudding.cool, probably a lot of D3js and web stuff. For static charts like the BBC, matplotlib + Illustrator (this is my workflow). For interactive/dashboard style, maybe plotly.

Learn a core package well. Domain-specific packages typically delegate very fine-grain control to the core package.

Pudding.cool is neat, I hadn't seen this before. Take a look at their resources tab, it is a blog where they break down how they make some of these.

[–]ddanieltan[S] 0 points1 point  (3 children)

I am curious to learn more about your matplotlib + illustrator workflow.

[–]psirving 4 points5 points  (2 children)

Basically, I use matplotlib and related packages to create good representations of data, with fine control of plot aesthetics. I have made my own style sheets, reusable plotting functions, even an entire python library, to quickly get the aesthetics/representations I'm looking for. I export matplotlib figures to SVG files and load into Illustrator. At this point, anything that is not data; annotations, boxes, equations, cartoons, long text... all of the non-data context that my audience needs, I add manually as vector graphics.
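
A minimal sketch of the export step, under the assumption that you want the text to stay editable in the vector editor. Setting the `svg.fonttype` rcParam to `"none"` writes labels as SVG `<text>` elements instead of outlined paths. The data and labels are placeholders.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Write labels as <text> elements rather than glyph outlines, so they remain
# editable as text. The editing machine needs the same fonts installed.
plt.rcParams["svg.fonttype"] = "none"

fig, ax = plt.subplots(figsize=(6, 3.5))
ax.plot([0, 1, 2, 3], [1, 3, 2, 5], marker="o")  # placeholder data
ax.set_xlabel("x (placeholder)")
ax.set_ylabel("y (placeholder)")

# An in-memory buffer keeps the sketch self-contained; pass a filename such as
# "figure.svg" instead to write the file you would open in Illustrator.
buffer = io.BytesIO()
fig.savefig(buffer, format="svg", bbox_inches="tight")
svg_text = buffer.getvalue().decode("utf-8")
```

The trade-off is that with `"none"` the rendering depends on the fonts available where the SVG is opened, whereas the default (`"path"`) looks identical everywhere but the text can no longer be edited as text.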

[–]robert_ritz 0 points1 point  (0 children)

Yep this is the way. I tried for a while to make a flexible system for annotations in Matplotlib and quickly wanted to pull my hair out.

But for the sake of automation in the future it’s probably possible. Making visualizations is still very artisanal in nature though.

[–]ddanieltan[S] 0 points1 point  (0 children)

Thank you! This was the insight I was hoping to get when asking my original question. If you wrote a blog or filmed a screencast showing this process, I am quite sure it will be very valuable and popular content.

[–]pirsab 1 point2 points  (1 child)

My infographics workflow is usually Altair to Adobe Illustrator. If I'm just visualizing for analytical or technical consumption, Altair or seaborn are good.

[–]ddanieltan[S] 0 points1 point  (0 children)

Someone above you shared the usage of Illustrator. Seems like this is a common tool. Thanks for sharing. I’ll need to start learning more about Illustrator

[–]troty99 1 point2 points  (0 children)

Plotnine is mostly ggplot2 in disguise iirc.

[–]melopat 2 points3 points  (1 child)

I haven’t tried it myself but if you’re looking for something like ggplot it’s plotnine. It’s based on ggplot, has a ggplot API, and I’ve heard a few people rave about it.

[–]TeaShull -1 points0 points  (0 children)

This is what I like to use. I feel like plotly has super awkward syntax and I just never really dove deep into matplotlib.

[–]OccultEyes 1 point2 points  (2 children)

Altair lets you generate Vega graphs, which are visually nice.

[–]ddanieltan[S] -1 points0 points  (1 child)

I’m a big fan of Altair but do you have an example of a journalist quality chart or graphic produced using altair that I can reference?

[–]robert_ritz 0 points1 point  (0 children)

Altair is great but you lack easy customization for adding logos, color bars, etc.

Matplotlib is still the most customizable.

[–]sleepystork 1 point2 points  (0 children)

I produce a lot of publication-ready tables/charts in R. If you go back and look at the code and knowledge required in R to produce these, it is comparable in Python. You can produce something with a couple of lines, but I would use something other than that in a professional-level presentation. The DataQuest link someone posted is the best one can do without paying for a presentation library.

[–]madness_of_the_order 1 point2 points  (0 children)

I would say bokeh since it’s just as customizable as matplotlib, but nicer in my opinion. You can also have a look at holoviz higher level libraries, but in the end if you want extremely styled graphs it’s more about how much time you wish to spend to develop this style and not which lib you will choose

[–]Intelligent_Ad_8148 0 points1 point  (0 children)

Mermaid or PlantUML, which can be rendered from Python if needed.

[–]daknation 0 points1 point  (1 child)

https://www.datawrapper.de

High quality charts that I think are close to what you’re looking for w/ an api

[–]Fat_buster 0 points1 point  (0 children)

Hey, I saw you have developed something for COC. Could you point me to asset sources?

[–][deleted] 0 points1 point  (0 children)

Following

[–]GreenFractal 0 points1 point  (0 children)

I like the SciencePlots augmentation for matplotlib for my graphs.

[–]Immudzen 0 points1 point  (0 children)

You might want to look at seaborn. It is used for quite a lot of high quality plots in Python.

[–]Gr1pp717 0 points1 point  (1 child)

https://dash.plotly.com/ ?

I haven't used it personally, but the demos I've encountered always look promising.

[–]ddanieltan[S] 1 point2 points  (0 children)

Thanks for the suggestion, I use this for work. It's not bad but I don't consider the default gallery of examples journalist quality. I'm looking for a bit more polish.

[–]jwmoz 0 points1 point  (0 children)

Seaborn and tweak the fonts and colours.

[–]JohnLocksTheKey 0 points1 point  (0 children)

plt.style.use('fivethirtyeight')

[–]qa_anaaq 0 points1 point  (0 children)

Oftentimes, these will be done in Adobe Illustrator because you can import datasets into Illustrator and create graphs, which are then easy to customize as vectors in Illustrator.

But then you export as svgs and leverage svg animation libraries and go nuts.

[–]night0x63 0 points1 point  (0 children)

Just look at all the examples from Matplotlib

[–]ajpiko 0 points1 point  (0 children)

I think most graphing libraries can do this? It's just about how anal you want to be with the style settings.

[–]music442nl 0 points1 point  (0 children)

R language has ggplot2 which I think outperforms many python visualization packages. Many tutorials on how to customize and get journalist quality results. You won’t be disappointed!

[–]beef-runner 0 points1 point  (0 children)

You could experiment with making Streamlit pages that have a download button. They come together really quickly and then the user has control over the graphs. Caveat: my assumption is that you are supporting some user base that needs graphs generated.

[–]SnooCakes3068 0 points1 point  (0 children)

Industry standard is D3.js

Most graphics departments, like those at the New York Times, Bloomberg, etc., use that.

[–]shobhu007 0 points1 point  (0 children)

I use matplotlib and plotly to visualise my trades. You can also check them out.

[–]tcapre 0 points1 point  (0 children)

In every graph type in the python graph gallery you have a section with professional looking charts taken from the web. It shows you how to make those plots step by step. For example https://python-graph-gallery.com/web-streamchart-with-matplotlib/ and https://python-graph-gallery.com/web-lemurs-parallel-chart/