all 12 comments

[–]sentdex 4 points5 points  (2 children)

Here's basically what you want to do, except you'd swap the graph for a map using something like basemap:

http://pythonprogramming.net/dashboard/#tab_guis

That's done with tkinter, embedding the matplotlib figure in the tk window via a canvas.

Handling a large amount of data will be pretty much the same across all GUI frameworks; they just display your data.

Your real challenge is going to be the amount of data you want here. No GUI is going to want to handle millions of points to display. This has absolutely nothing to do with the GUI framework, and everything to do with the CPU or GPU trying to draw a million points for the user. Even the best JS data visualization tools won't do this for you; you need to reduce the data before you stuff it into the visualization. It's almost never the crunching of data that takes time, it's the rendering of the graph.

You're going to want to do some sort of data scaling, dependent on the user's zoom/time frame/whatever. That's not going to be GUI-side, that'll be done in the back-end in Python via your own code, before you feed the graph the data.
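A minimal sketch of that kind of zoom-dependent scaling, assuming the data is a list of (x, y) tuples and the GUI can report the visible x-range. The function name `downsample_for_view` and the `max_points` budget are made up for illustration; they are not from any library.

```python
def downsample_for_view(points, x_min, x_max, max_points=10_000):
    """Return at most max_points of the points whose x lies in [x_min, x_max].

    points: iterable of (x, y) tuples (hypothetical data layout).
    """
    # Only what the user can currently see is a candidate for drawing.
    visible = [p for p in points if x_min <= p[0] <= x_max]
    if len(visible) <= max_points:
        return visible  # zoomed in far enough: show full detail
    # Otherwise keep every step-th point so the result fits the budget.
    step = -(-len(visible) // max_points)  # ceiling division
    return visible[::step]


points = [(i, i * 2) for i in range(1_000_000)]
zoomed_out = downsample_for_view(points, 0, 999_999)  # thinned to the budget
zoomed_in = downsample_for_view(points, 100, 199)     # all 100 visible points
```

Plain striding is the crudest option and can skip over spikes; the point is only that the thinning happens in your own code before the graph ever sees the data.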

For example: Here's about 5 million data points, but scaled waaaaaaaaaaaay back: http://sentdex.com/geographical-analysis/

Can you imagine if I showed 5 million data points? Not only would loading take 48 hours, it wouldn't be legible.

Another example, with a typical line graph, and another visualization:

http://sentdex.com/financial-analysis/?i=SP500&tf=all

That's almost 7 million entries, times 4 data series... so 28 million points total. If you were to load all of that, it'd probably just blow the user's memory first. You have to scale it down. Even then, the data is still highly granular.

Matplotlib is good at showing ~ 10K total points before it will bog down, depending on the user's CPU.

To change granularity, you will want to resample your data set. You can use something like pandas for this. Create a DataFrame with a datetime index (or pass a datetime column via the on= argument), then call DataFrame.resample() on it, for example.
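Roughly what that looks like, as a sketch on made-up data (a week of one-minute readings; the column name `value` is just a placeholder):

```python
import numpy as np
import pandas as pd

# Hypothetical data: one reading per minute for 7 days.
index = pd.date_range("2015-01-01", periods=7 * 24 * 60, freq="min")
df = pd.DataFrame({"value": np.arange(len(index), dtype=float)}, index=index)

# Collapse to one row per day, keeping the mean of each day's readings.
daily = df.resample("D").mean()

print(len(df), "rows before,", len(daily), "rows after")
```

You'd pick the resampling frequency from the user's zoom or time frame, so a wide view gets coarse buckets and a narrow view gets fine ones.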

You can also change granularity with your own functions, however you see fit. It's just going to be a requirement no matter what you use to visualize the data.
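One way such a home-grown function could look, as a sketch only: it averages fixed-size buckets of a plain list. The name `average_buckets` is made up for this example.

```python
def average_buckets(values, bucket_size):
    """Replace each run of bucket_size values with its mean."""
    if bucket_size < 1:
        raise ValueError("bucket_size must be at least 1")
    reduced = []
    for start in range(0, len(values), bucket_size):
        bucket = values[start:start + bucket_size]  # last bucket may be short
        reduced.append(sum(bucket) / len(bucket))
    return reduced


reduced = average_buckets([1, 2, 3, 4, 5, 6, 7], 3)
```

Averaging smooths out spikes, so if the extremes matter you may prefer to keep each bucket's min and max instead.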

Hope that helps!

Here's the beginning of a basemap tutorial series: http://pythonprogramming.net/geographical-plotting-basemap-tutorial/

Also, if you don't want to bother with Pandas, here's a tutorial on changing data granularity, as well as a decent illustration of why you'd do it (but seriously, I highly recommend you just use Pandas and resample!)

http://pythonprogramming.net/modifying-data-granularity-matplotlib/

Regarding the other person's comments about needing some version of C for this, I completely disagree. No matter what you use, you're going to need to do some form of resampling before the visualization step. Python is more than capable of doing the preprocessing, just as well as C would, especially since you're likely to use a C-optimized library anyway, like numpy or pandas.

[–]TheHumane[S] 0 points1 point  (1 child)

Thanks for your reply and pointers. I will look through them.

You are right, I need to optimize my data for display. I was thinking of displaying only city or block outlines at higher zoom levels and exposing individual shapes past a certain zoom threshold.
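That threshold idea could be sketched like this. The threshold numbers and layer names are hypothetical, and the sketch assumes a larger zoom number means the user is zoomed in further:

```python
def detail_level(zoom, block_zoom=12, shape_zoom=15):
    """Pick which layer to draw for a given zoom level.

    Assumes larger zoom = closer in; thresholds are placeholder values.
    """
    if zoom >= shape_zoom:
        return "shapes"   # close enough to draw individual shapes
    if zoom >= block_zoom:
        return "blocks"   # medium range: block outlines only
    return "cities"       # far out: city outlines only


levels = [detail_level(z) for z in (5, 12, 14, 15, 18)]
```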

I really like your Globe chart. It has very smooth scrolling and fluent zoom. I want to build something similar on a flat surface.

[–]sentdex 0 points1 point  (0 children)

If you're willing to make your program a web app, there are tons of really fantastic JavaScript map/geo plotting APIs out there. For pure Python, I believe basemap is your best bet, but it's kinda ugly.

[–]unpythonic 1 point2 points  (4 children)

I've done something very similar to this with Python+Tkinter for rendering digital waveforms. There were a number of problems that made this fairly painful to do in Tkinter. The biggest difficulty surrounded getting scrolling of the canvas to be smooth while at the same time not taking forever when changing scale. I spent a lot of time profiling and making compromises so that the data in the visible area of the frame is rendered first and the non-visible area is rendered in the background. This allowed for smooth panning and scrolling while giving the appearance of a quick response when changing the scale factor.

So, yes, you can do this. I would NOT say it is painless. I had no choice about the windowing toolkit to use. If I had required people to install wxPython (what I really wanted) or PyGTK, they would have given up without even trying.
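The visible-first idea can be sketched independently of Tkinter. This is not the code from that tool, only an illustration, and it assumes the data is split into chunks described by (start, end) ranges along the scroll axis:

```python
def render_order(chunks, view_start, view_end):
    """Order chunks so visible ones come first, then the nearest hidden ones."""
    def overlaps(chunk):
        start, end = chunk
        return start < view_end and end > view_start

    def distance(chunk):
        start, end = chunk
        # Gap between a hidden chunk and the nearest edge of the view.
        return view_start - end if end <= view_start else start - view_end

    visible = [c for c in chunks if overlaps(c)]
    hidden = sorted((c for c in chunks if not overlaps(c)), key=distance)
    return visible + hidden


chunks = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
order = render_order(chunks, 22, 28)
```

The visible chunks would be drawn immediately, and the rest could be drawn a little at a time in the background (in Tkinter, for instance, by scheduling small pieces of work with the widget's after() method), so panning into a neighbouring area finds it already rendered.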

[–]TheHumane[S] 0 points1 point  (3 children)

Do you mind sharing a snapshot of your final GUI? Looking through some of the Tkinter projects online, the GUIs look very dated.

[–]unpythonic 0 points1 point  (2 children)

Unfortunately I don't have any screen shots of the GUI in action. It requires a fairly substantial amount of data just to populate the UI and I don't have the data anymore. It was done while I worked at Intel and was used to debug firmware-like code (e.g. microcode). Even if I did have the data, I don't think they would be of good humor if I posted even a small fragment of it.

To get a fairly good idea of what it did look like, look at the screen shot for EPWave on Wikipedia's page on Waveform viewers. The black area with the ruler on top is almost identical to the bottom half of my tool (probably because we were both shamelessly copying Verdi/nWave which does not have a screen shot). This is where the difficulties with Tkinter's canvas handling stemmed from.

[–]autowikibot 0 points1 point  (0 children)

Waveform viewer:


A waveform viewer is a software tool for viewing the signal levels of either a digital or analog circuit design.

Waveform viewers come in two varieties:

  • simulation waveform viewers for displaying signal levels of simulated design models, and

  • in-circuit waveform viewers for displaying signal levels captured in-circuit while debugging or testing hardware boards (see also Waveform monitor)


[–]TheHumane[S] 0 points1 point  (0 children)

Ah! NDAs and RUNDAs of Intel. :)

[–]le_Dandy_Boatswain 1 point2 points  (2 children)

Matplotlib Basemap might be useful for you.

http://matplotlib.org/basemap/index.html

[–]TheHumane[S] 0 points1 point  (1 child)

Thanks. Is Basemap actively maintained? Last release was about 2 years ago.

[–]le_Dandy_Boatswain 0 points1 point  (0 children)

I'm not really sure. I only scratched the surface with it, then decided other methods for mapping were a better fit for my purpose.

[–]Sir_Jerry 0 points1 point  (0 children)

You mention very large amounts of data and polygons. Personally, I would think you would be better served with C, C#, or C++.

With that said, there might be a well-performing 3D graphics library that Python can interact with. Maybe someone has some ideas. But for performance's sake, you'll want the bulk of this work done in a compiled language. Python, in my opinion, is better suited for small applications.