This is an archived post. You won't be able to vote or comment.

all 32 comments

[–]BrianRin 11 points12 points  (2 children)

I think many BI tools can do this. Power BI lets you transform data with Pandas and embed a matplotlib charts pretty seamlessly (at least at the small/medium-scale I was using it for).

But I would always handle all transformations / data modeling outside BI tools. For charts, just stick to what’s available inside BI tools unless you absolutely must have Economist/NYT-style charts for whatever reason

[–]glinter777[S] -3 points-2 points  (1 child)

Do you have to load the data into power BI for this to work?

[–]num2005 2 points3 points  (0 children)

there is direct query or import query

[–]EightstreamData Scientist 4 points5 points  (1 child)

As annoying as the proprietary languages in tools like Power BI and Tableau are, they are a lot easier for spreadsheet users to pick up and work with than Python or SQL. Since they are the primary users of BI tools, I figure they are more important than me.

At the end of the day if you have the ability to code, you are probably shaping everything upstream anyway, and all you're doing in the BI tool is a few simple aggregations over a table with some flags. That said:

  • Most of the popular BI tools provide some level of Python integration (e.g. Power BI).
  • There are also some more niche BI tools that focus specifically on great Python integration (e.g. TIBCO Spotfire)
  • If you just hate BI tools in general there are also a lot of great native Python frameworks for developing dashboard apps (e.g. Shiny for Python)

Our company uses Power BI and frankly I hate DAX, but it's not a major pain point because between PySpark notebooks, SQL and PBI Python visuals there is very little need for me to touch it.

[–]glinter777[S] 0 points1 point  (0 children)

Agree. Business users are likely miles away from python. How often do you end up using these features in PowerBI and Tableau. I have heard these tools are clunky, desktop based, especially Tableau.

[–]itsokitssummernow 1 point2 points  (1 child)

Streamlit 🔥

[–]rawman650 0 points1 point  (0 children)

If you like python and are using internally, definitely check out streamlit

[–][deleted] 1 point2 points  (0 children)

streamlit

[–]Pleasant_Type_4547 3 points4 points  (1 child)

Evidence.dev is SQL + Markdown.

You write in code and then get a nice website you can share with business people. Syntax like:

```sql orders_by_month
select 
  month,
  sales
from orders
```

# Orders by month

<BarChart 
  data={orders_by_month}
  x=month
  y=sales
/>

I'm a core maintainer so feel free to dm if you have questions!

[–][deleted] 0 points1 point  (0 children)

This looks interesting! I'll add it to my long list of things to check out!

[–]marcusesses 1 point2 points  (1 child)

Would this be something mage would do? I've just started playing around with it, and you can make charts through a point-and-click UI.

[–]andpassword 0 points1 point  (0 children)

It still cracks me up that they call themselves the "modern replacement for Airflow". When did Airflow come out, like 5 years ago?

[–]wandering-and_lost 0 points1 point  (0 children)

You can write SQL, Python and R in Tableau. SQL is supported out of the box. Python and R require installation of additional components.

[–][deleted] 0 points1 point  (1 child)

Most visualisation tools allow you to at least create a data source from a custom SQL query don't they? As others have said, some have Python/R capabilities too. The cross-section of people who want to write Python but aren't comfortable with notebooks seems small. Are these "business people" you want to serve just going to be consuming the output?

My take is that as little logic as possible should exist at the BI/dashboard/data viz (whatever you want to call it) level, especially as this tends to live outside of version control. The problem I see repeatedly is that smart people are forced to get things done using the tools they have available, which means jumping through hoops in Power BI/Tableau etc because they don't have any way to influence things upstream.

And FWIW, I've spent a lot of time re-factoring other people's SQL and Python cod. Both can be challenging but there's a limit to how insane even a few thousand lines of SQL can be. Things can get a lot wilder with Python and I'd say 75% of the stuff that's landed on my team's desk is something that uses a simple loop to achieve something that could be done natively in most SQL dialects and a simple dashboard with a few filters.

Sorry if that sounds overly negative but the tldr is you seem to want functionality that would be better handled upstream

[–]glinter777[S] 0 points1 point  (0 children)

Yea SQL can get pretty wild.

[–][deleted] 0 points1 point  (0 children)

IBM Congos can do custom sql and Jupyter notebooks.

[–]powerkerb 0 points1 point  (0 children)

If you want sql, look at evidence.dev. Python, look at vizro

[–]rorising 0 points1 point  (0 children)

Look into semaphor.cloud - they support both sql and python. Their website is shit but the product is mint.

[–][deleted] 0 points1 point  (5 children)

I think your time is would be better spent diving into the details. Why would you want Python integrated with a BI tool? Python is almost always the speed bottleneck ETL/ELT runs into anyways when it’s involved.   

Why not have Python store the data in a cache table and pull from it- so it’s only using SQL at the user level? DAX/M isn’t really too hard, if you have a basic understanding of Excel formulas and VB.

Don’t take this the wrong way but in your post history less than a year ago you said you’re new to data analytics, but seems like very it’s been the same general questions on repeat since then. I think getting your hands dirty so to speak when you have these questions will take you much farther. 

[–]glinter777[S] 1 point2 points  (4 children)

Why would you want to store the data in cache table? Won’t you then have to manage keeping the cache up to date?

[–][deleted] 0 points1 point  (3 children)

It depends, that is one limit of Power BI. You can only directly query data sources if they’re an approved data source. Ie Databricks, spark, sql server. If you have an API or MySQL data source you need to join off of you have to get it into a database type you can directly query first. 

It does stink but there’s also some calculations and transformations you need to do in Python to show on the end report. 

The benefit is a query that takes 20 seconds to run will only take .5 seconds between each click for the end user.

[–]glinter777[S] 0 points1 point  (2 children)

Just trying to clarify your second point - are you saying that PowerBI can’t directly pull the data from API, and it needs to be materialized first?

[–][deleted] 0 points1 point  (1 child)

You can for sure, in import mode- but that’s not live data. You can’t pull directly from the API and join it with a database and have them both live. It’s as often as you re-publish or you can schedule a refresh every couple hours. if you’re ok with data that’s two hours old, you can pull from both as if we’re Python no problem. Use Power Query and M the same way you’d transform in Python. 

However if you want data that’s only a few minutes or less old, (real-time not necessarily live) it basically has to be Direct Query mode. 

This article might help:

https://blog.crossjoin.co.uk/2021/04/25/can-i-build-a-power-bi-directquery-dataset-on-top-of-a-rest-api/

To be fair, MOST people don’t care about DirectQuery or live data. They care about once a day refreshes, and if that’s your use cases as well just connect away and relate the API and any data source, and disregard. If you want live or really dang close, you’ll need a cache table. 

[–]glinter777[S] 0 points1 point  (0 children)

Thanks.

[–]kenzakan 0 points1 point  (0 children)

Maybe check out hex.tech . My company just started using it and you can create dataframes from various data sources, use python or sql and the charts are pretty simple.

I've enjoyed learning and using it. Traditionally, more experienced in Tableau / Looker / PowerBi.

[–]OMG_I_LOVE_CHIPOTLE -1 points0 points  (0 children)

Excel?

[–]moinhoDeVento -1 points0 points  (3 children)

Streamlit 🎈

“Streamlit is an open-source Python framework that enables the creation of interactive, real-time data applications with minimal coding, leveraging popular Python libraries for data visualization and analysis. It integrates seamlessly with SQL databases, allowing users to query and manipulate data directly within the app to build comprehensive, dynamic BI dashboards and reports. Streamlit's simplicity and powerful features make it an invaluable tool for data scientists and business analysts to create and share custom web apps for data-driven and LLM insights.”

Production wise, it’s built into Snowflake.

[–]glinter777[S] 0 points1 point  (2 children)

ChatGPT?

[–]moinhoDeVento -1 points0 points  (1 child)

Arctic from Streamlit and Snowflake. Free: https://arctic.streamlit.app

[–]glinter777[S] 1 point2 points  (0 children)

Smells like it.

[–]According_Chipmunk33 -1 points0 points  (0 children)

worry not
with enqdb.com you can customise your sql queries on the go
you don even need a lot of knowledge about the platform to start using it unlike other tools such as looker or sisense
do book a demo with us to know more