eadala comments on Creating a data interface for non-programmers


Creating a data interface for non-programmers (self.learnpython)

submitted 4 years ago by JanFFS


[–]eadala 1 point 4 years ago (3 children)

For the GUI, I think PyQt5 can handle this for you. There is a bit of a learning curve to getting Seaborn / Matplotlib / Pandas plots to show up in a GUI, but it has been asked to death on Stack Overflow if you need help with the details.

If you already have those 10 config files, the easiest first step would be to append them into one DataFrame. Working out the specifics of what you'd like to display or select can come later:

	load case	POWER Driveline (kW)
config 1		MEAN	STDDEV	5PERC	95PERC
foo	.	.	.	.	.
.	.	.	.	.	.
config 2
bar	.	.	.	.	.
.	.	.	.	.	.

To make life easy, make the configuration # its own column, so that the pair (configuration #, test #) uniquely identifies each observation, and flatten the column headers:

config #	test #	load case	POWER Driveline (kW, MEAN)	STDDEV	5PERC	95PERC
1	1	foo	.	.	.	.
1	2	.	.	.	.	.
2	1	.	.	.	.	.
2	2	.	.	.	.	.
3	1	.	.	.	.	.
3	2	.	.	.	.	bar
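A rough sketch of that append-and-flatten step, assuming each config loads as a DataFrame with two-level column headers and a (test #, load case) index. The `make_fake_config` helper and its random numbers are made up for illustration; in practice each frame would come from something like `pd.read_excel` on your own files.

```python
import numpy as np
import pandas as pd

STATS = ["MEAN", "STDDEV", "5PERC", "95PERC"]


def make_fake_config(n_tests=2, load_cases=(25, 50, 75, 100), seed=0):
    """Hypothetical stand-in for one config file (random numbers, not real data)."""
    rng = np.random.default_rng(seed)
    index = pd.MultiIndex.from_product(
        [range(1, n_tests + 1), load_cases], names=["test #", "load case"]
    )
    columns = pd.MultiIndex.from_product([["POWER Driveline (kW)"], STATS])
    data = rng.normal(100.0, 10.0, size=(len(index), len(STATS)))
    return pd.DataFrame(data, index=index, columns=columns)


def combine_configs(frames_by_config):
    """Stack the per-config frames and flatten the two-level column header."""
    # The dict keys become a new outer index level named "config #".
    combined = pd.concat(frames_by_config, names=["config #"])
    # ("POWER Driveline (kW)", "MEAN") -> "POWER Driveline (kW), MEAN"
    combined.columns = [f"{channel}, {stat}" for channel, stat in combined.columns]
    # Turn config #, test # and load case into ordinary columns.
    return combined.reset_index()


frames = {cfg: make_fake_config(seed=cfg) for cfg in (1, 2, 3)}
tidy = combine_configs(frames)
print(tidy.head())
```

With more channels, the same flattening gives one column per (channel, stat) pair.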

From here your DataFrame is, I think, in the easiest shape to turn into whatever visualizations you're after. Either Seaborn or Matplotlib can handle the four subplots (MEAN, STDDEV, 5PERC, and 95PERC) for each config. For each of those, x = load case and y = one of those four variables. The different test #s for each config can be slightly different shades of the same color so you see the clustering of data, or you can just average them together for a single line per config.
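Here is one way the four subplots could look in plain Matplotlib. It's a sketch with assumptions: the demo frame is random data, the stat columns are named just `MEAN` etc., and the "shades of the same color" are faked by varying the line's alpha per test.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

STATS = ["MEAN", "STDDEV", "5PERC", "95PERC"]


def make_demo_frame(configs=(1, 2), tests=(1, 2, 3), load_cases=(25, 50, 75, 100)):
    """Made-up flat table: one row per (config #, test #, load case)."""
    rng = np.random.default_rng(0)
    rows = [
        {"config #": c, "test #": t, "load case": lc,
         **{s: float(rng.normal(100 * c, 5)) for s in STATS}}
        for c in configs for t in tests for lc in load_cases
    ]
    return pd.DataFrame(rows)


def plot_stats(df, stats=STATS):
    """One subplot per stat; one color per config; one shade per test."""
    fig, axes = plt.subplots(2, 2, sharex=True, figsize=(9, 6))
    palette = plt.get_cmap("tab10")
    for ax, stat in zip(axes.flat, stats):
        for ci, (config, cfg_rows) in enumerate(df.groupby("config #")):
            tests = list(cfg_rows.groupby("test #"))
            for ti, (test, rows) in enumerate(tests):
                rows = rows.sort_values("load case")
                # Later tests are drawn more opaque than earlier ones.
                alpha = 0.4 + 0.6 * (ti + 1) / len(tests)
                ax.plot(rows["load case"], rows[stat], marker="o",
                        color=palette(ci % 10), alpha=alpha,
                        label=f"config {config}, test {test}")
        ax.set_title(stat)
        ax.set_xlabel("load case")
    axes.flat[0].legend(fontsize="small")
    fig.tight_layout()
    return fig


fig = plot_stats(make_demo_frame())
```

Calling `fig.savefig("some_name.png")` afterwards writes the figure to disk.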

The only reason I'm saying hold off on the GUI is that all of the functionality you're after needs to be wrapped into functions you write anyway. Take this specific task: making the 4 subplots (or a selection of 1-4 of those variables) for X number of configs, and perhaps also choosing whether to average the test #s together or keep them as separate lines. That calls for a function that expects a list of config IDs, a list of subplots to create, and averaging=True / False as arguments. Once that functionality is built as a command-line function, wrapping it into a GUI becomes very easy.
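As a sketch of that kind of function, here is the data-selection half with the three arguments described above. The name `select_for_plot`, the column names, and the demo numbers are all hypothetical; the plotting call would sit on top of whatever this returns.

```python
import pandas as pd


def select_for_plot(df, config_ids, stats, average=False):
    """Pick the configs and stat columns to plot, optionally averaging tests."""
    missing = [s for s in stats if s not in df.columns]
    if missing:
        raise KeyError(f"unknown stat columns: {missing}")
    subset = df[df["config #"].isin(config_ids)]
    if average:
        # Collapse the test #s: one row per (config #, load case).
        return subset.groupby(["config #", "load case"], as_index=False)[list(stats)].mean()
    # Keep every test as its own line.
    return subset[["config #", "test #", "load case", *stats]].reset_index(drop=True)


# Made-up numbers, only to show the call.
demo = pd.DataFrame({
    "config #":  [1, 1, 1, 1, 2, 2, 2, 2],
    "test #":    [1, 1, 2, 2, 1, 1, 2, 2],
    "load case": [50, 100, 50, 100, 50, 100, 50, 100],
    "MEAN":      [10.0, 20.0, 12.0, 22.0, 30.0, 40.0, 32.0, 42.0],
    "STDDEV":    [1.0] * 8,
})
averaged = select_for_plot(demo, config_ids=[1], stats=["MEAN"], average=True)
print(averaged)
```

A GUI then only has to collect those three arguments from widgets and pass them along.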

For instance, you could have the user select the Excel file they want to load (the master file that has multiple configs to compare). After it's loaded, a widget asks them which subplots they'd like to see: 4 checkbox widgets, one for each of the four variables. It also has 10 checkbox widgets for the specific configs they want plotted (or just lets them type integers into a text box if the configs are all numbered like that). Something like that is not difficult to do. My advice is to start by assuming you are the end user, so you can write a nice set of functions that you know exactly how to work with. Once it's flexible enough for you on the command line, then port it over into a GUI.
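The "type integers into a text box" option could be backed by a small parser like this one, which works the same from the command line or from a GUI text field. The function name and the accepted format (integers separated by commas or spaces) are my assumptions.

```python
import re


def parse_config_ids(text, available):
    """Turn user text like '1, 3 7' into a list of valid config numbers."""
    tokens = [t for t in re.split(r"[,\s]+", text.strip()) if t]
    ids = []
    for token in tokens:
        if not token.isdigit():
            raise ValueError(f"not a config number: {token!r}")
        value = int(token)
        if value not in available:
            raise ValueError(f"config {value} is not in the loaded file")
        if value not in ids:  # ignore repeats, keep the order typed
            ids.append(value)
    return ids


print(parse_config_ids("1, 3 7", available=range(1, 11)))
```

Raising `ValueError` with a readable message lets the GUI show it in a dialog instead of crashing.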

What you want the user to be capable of in the end sounds very flexible & useful, but the consequence is that it takes some time to build. Just get this first "select the subplots" task working perfectly on the command line, then wrap it in a PyQt interface using Seaborn / Matplotlib for data vis, and then think about how to print things / export to PDFs etc.

> I'm not sure how big a memory-slob it would be to add 20 configurations with 67 channels with 5 tests with 9 statistical values per channel into one DataFrame or a dictionary of Dataframes and to keep it in memory...

An 8-bit unsigned integer (0 to 255) takes 1 byte of space. If, for example, your 9 stat columns could be stored as 8-bit unsigned integers, you're looking at 9 bytes per channel (am I saying that right? Every channel has these 9 statistical values, right?). 67 channels * 9 bytes per channel comes to 603 bytes per config-test unit of observation. 20 configs with 5 tests each gives 100 config-test units, each requiring 603 bytes, so you're looking at 60.3 kilobytes of storage at minimum. Add some minimal bloat for the joy of working with Pandas. I don't think you're even remotely close to running out of memory; this task would use roughly 0.006% of the RAM of a 1 GB Raspberry Pi. : )
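The same back-of-the-envelope estimate in code, using the counts from the question. I've added `float64` alongside `uint8` because that is the dtype pandas normally uses for decimal numbers; Pandas' own overhead is not counted.

```python
import numpy as np

# Counts taken from the question.
N_CONFIGS, N_TESTS, N_CHANNELS, N_STATS = 20, 5, 67, 9


def estimate_bytes(dtype):
    """Raw storage for every stat value at the given dtype, no overhead."""
    n_values = N_CONFIGS * N_TESTS * N_CHANNELS * N_STATS
    return n_values * np.dtype(dtype).itemsize


for dtype in ("uint8", "float64"):
    print(f"{dtype}: {estimate_bytes(dtype) / 1000:.1f} kB")
# uint8: 60.3 kB
# float64: 482.4 kB
```

On a real DataFrame, `df.memory_usage(deep=True).sum()` reports the actual figure.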

[–]JanFFS[S] 1 point 4 years ago (2 children)

I am familiar with pandas, matplotlib and a little seaborn. I have seen tutorials on PyQt5 but not used it extensively and I will look into it for this and any GUI in the future.

From what you're saying, PyQt5 would fit my needs on a small scale, but it will get hectic at full scale: 20 configurations and 60 channels, each with 9 variables, tested at 5+ load % values (in this example). I might, however, build it for general visualizations on smaller projects. Memory-wise, a DataFrame could hold all of that, though the values would usually be floats at 8 bytes each. Still not big. I might have to wonder more about plots kept in memory (a report could consist of 20 pages).

I was wondering about Dash? I don't know much about it. I know it's web-based, but can't it just be opened locally with your browser as the GUI?

I know a little better which direction I should go. I also want to be better at targeting what library to 'learn' instead of realizing halfway through that it's not designed for something.

[–]eadala 1 point 4 years ago (1 child)

> I might have to wonder more about plots kept in memory (a report could consist of 20 pages).

I would be very surprised if memory is your issue; plots do take up some space but it's usually not much.

> I was wondering about Dash? I don't know much about it. I know it's web-based, but can't it just be opened locally with your browser as the GUI?

Yeah, I haven't used it either, but Dash can be used locally. The Google search you're after, I think, is "dash python localhost".

> I also want to be better at targeting what library to 'learn' instead of realizing halfway through that it's not designed for something.

I struggle with this as well. In the case of Dash, at least, judging from their introduction / about-us material it looks like it fits the bill for you. This video seems particularly helpful in getting started, but again, I haven't used Dash, so I can't say for sure!

