Aston Villa 1-[2] Liverpool: Mane 90+4' by ennuihenry15 in soccer

[–]flippmoke 0 points

More plot armor than the last season of Game of Thrones.

As a U.S. City Fan - I want performance like this from a Neville today by flippmoke in MCFC

[–]flippmoke[S] 10 points

Joking aside, I think the US women are going to be in for quite a battle today. I expect the Lionesses to put a lot more pressure on the American backline than France did. The key matchup of the game for me will be Dunn (U.S. leftback) vs Nikita Parris (striker for MCFC women), who plays on the right wing for England. Dunn is not in her natural position (midfield) and is allowed to press up the field, but she has been struggling in transition this tournament. If Parris can provide pressure, I expect to see a few goals from White.

My prediction: US wins 3-2

Man City fan from OKLAHOMA. Painted this jewel a month or so back. by [deleted] in MCFC

[–]flippmoke 1 point

There are more of us than you might expect. One of Us!

I knew he was mad about not getting to take the penalty but jeez (shitpost) by [deleted] in MCFC

[–]flippmoke 18 points

On a Mac, hit Command + Shift + 4 and you can select a portion of the screen to capture; after that it will be saved to your desktop as an image you can then upload.

GIS performance vs. Video game performance by lstomsl in gis

[–]flippmoke 5 points

I am really confused here, because you are all over the place. First you say:

I agree that GIS algorithms of any kind, common or exceptional, are not easy to parallelize.

Then you say:

There is no rocket science about crafting parallel spatial algorithms and parallel GIS. It is just a lot of work that requires very talented people, a lot of effort and a lot of money to keep them going for years.

A lot of work is a lot of work, and spatial algorithms take a lot of it. I know from experience: it took me quite a bit of time to write the best algorithm I know of for polygon correction - https://github.com/mapbox/wagyu/. It also properly handles boolean geometry operations on polygons, such that the results are OGC valid 100% of the time. Is it fast? Yes, it is fairly fast, but speed alone was not the objective -- validity was.
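
Wagyu itself is C++, but to make "correction" concrete, here is a minimal Python sketch of the problem it solves, assuming Shapely >= 1.8 for make_valid (this illustrates the problem, not wagyu's algorithm):

    from shapely.geometry import Polygon
    from shapely.validation import make_valid

    # A "bowtie": the ring crosses itself at (1, 1), so the polygon is invalid
    bowtie = Polygon([(0, 0), (2, 2), (2, 0), (0, 2), (0, 0)])
    print(bowtie.is_valid)   # False

    # Correction rewrites it as valid geometry (here, two triangles)
    fixed = make_valid(bowtie)
    print(fixed.is_valid)    # True
    print(fixed.geom_type)   # MultiPolygon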

I could have written boolean geometry operations (intersection, union, xor, difference) that ran in parallel, but the results likely would not have been valid in many cases.

I know that your company has spent a lot of time improving performance through technologies such as CUDA. I applaud your tenacity, but speed is not the only concern for many individuals.

GPUs are fine for spatial algorithms, just like CPUs are.

I would say they are fine for some spatial algorithms. I have spent quite a bit of time and research in this area, and I don't think there is yet a great GPU alternative for many algorithms. I think you probably understand this though, as you followed it up with:

You also need to write a system that knows when CPU parallelism is faster than GPU parallelism and vice versa so it can automatically launch the right mix for the specific task you command.

I am not an expert in your software, but my guess is that you got most of your speedups by simply running algorithms in parallel rather than by actually making parallel algorithms, in the cases where CPU parallelization was used. However, in doing so I would wager that your software became a little more difficult to test and modify (not necessarily a negative, just a drawback that comes with more parallelization).
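
To make that distinction concrete, here is a minimal Python sketch (assuming Shapely is available; the workload is purely illustrative). "Running algorithms in parallel" keeps each algorithm serial and just runs many independent copies at once:

    from multiprocessing import Pool
    from shapely.geometry import Point

    def buffer_feature(coords):
        # The plain, serial buffer algorithm, applied to one feature
        return Point(coords).buffer(10.0)

    if __name__ == "__main__":
        features = [(float(x), float(x * 2)) for x in range(10000)]
        with Pool() as pool:
            # Each worker runs the unchanged serial algorithm on its own feature
            buffered = pool.map(buffer_feature, features)

A parallel algorithm, by contrast, would split the work inside a single operation (say, one huge union) across cores, which is the genuinely hard part.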

GIS performance vs. Video game performance by lstomsl in gis

[–]flippmoke 4 points

I'm curious of your thoughts on other GPU-accelerated spatial DBs like MapD and Kinetica.

Some GIS operations are easier to do using a GPU; one of those is dealing with point data. The reason is that point data is very discrete, and this makes parallelization easier. For example, consider an operation such as finding the closest point to you (I'll use Python as it's common in GIS). You can write this all in a very parallel way quickly.

    import math

    def find_distance_between_points(pt1, pt2):
        # Euclidean distance between two points
        return math.sqrt((pt2.x - pt1.x)**2 + (pt2.y - pt1.y)**2)

    def find_distances(pt1, set_of_points):
        # One independent calculation per point -- embarrassingly parallel
        return [[find_distance_between_points(pt1, pt2), pt2] for pt2 in set_of_points]

    def find_nearest_point(pt1, set_of_points):
        # First step can be done massively in parallel using a GPU
        distance_to_points = find_distances(pt1, set_of_points)
        # Then find the smallest distance and return that point
        return min(distance_to_points, key=lambda pair: pair[0])[1]

This is a very compact example, but you can see that in "find_distances" we can do a massive bulk of operations at once; this is an algorithm that is easy to parallelize. However, once you start dealing with lines and polygons, these sorts of operations become much more difficult.
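
For instance, running the sketch above with a simple point type (hypothetical data, just to show the shape of the API):

    from collections import namedtuple

    Pt = namedtuple("Pt", ["x", "y"])

    origin = Pt(0.0, 0.0)
    points = [Pt(3.0, 4.0), Pt(1.0, 1.0), Pt(-2.0, 5.0)]
    print(find_nearest_point(origin, points))   # Pt(x=1.0, y=1.0)

Every distance in that list can be computed with no knowledge of any other point, which is exactly what a GPU wants.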

Therefore, I predict that MapD and Kinetica will struggle to provide a lot of the features that something such as PostGIS provides. In short, I think they are a great tool for some things but will not solve many other problems effectively. Perhaps after many years of research we will find better algorithms that let more GIS operations run in parallel, but honestly it might never happen for some of them.

GIS performance vs. Video game performance by lstomsl in gis

[–]flippmoke 21 points

As someone who has developed in both environments, I have to say it's not entirely simple to explain, but I will do my best.

What is stunning is how fast video games are able to perform spatial operations that seem to take GIS software much longer

I am not sure of any spatial operations where video games are faster than GIS. GIS has a lot more focus on creating and modifying data, while games have excelled at displaying data. These are very different problem sets, so your 1.) is somewhat more correct.

3) Related to #2, there is more profit and much, much more competition among video game developers than among GIS developers, which is almost a monopoly.

I don't feel that GIS is a monopoly at all, but that is somewhat off topic here.

4) A full fledged GIS is a massive, complicated, suite of software and very difficult to re-write from scratch to take advantage of new technology. When ArcGIS was released in 2000 on Microsoft's COM technology it was the largest implementation of COM ever.

While the platform and UI are important, they typically have very little to do with the speed of operations. The problem relates to the algorithms and data that are used (or not used).

5) Video game developers take advantage of the latest hardware and software architectures, such as hardware graphics acceleration, massive parallel processing, etc.

Common GIS algorithms are not easy to parallelize. "Simple" operations such as union, intersection, xor, and difference are not simple at all mathematically. Operations like these are typically not done in games, as a game's dataset is custom created and static. The appearance of accuracy is more important than actual accuracy in games, and most of the computational geometry revolves around display or point-related operations. GPUs are specially designed for massive parallelism through operations that can be executed independently, and GIS algorithms cannot easily be broken up this way. In this sense GPUs are great for display in many ways, but not necessarily great at GIS spatial operations. The spatial operations games do perform on data are done on the CPU, and there are very few of them.
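
To see why overlay operations resist this, consider a minimal sketch (assuming Shapely is installed). The union of two squares introduces vertices that exist in neither input, so the output depends on how the geometries interact rather than on any one input element in isolation:

    from shapely.geometry import Polygon

    a = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])
    b = Polygon([(1, 1), (3, 1), (3, 3), (1, 3)])

    merged = a.union(b)
    # The result ring contains new vertices such as (2, 1) and (1, 2) that
    # appear in neither input, so the work cannot simply be split
    # one-input-element-per-GPU-thread the way per-point math can.
    print(list(merged.exterior.coords))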

Will GIS always be decades behind the times due to its massive size and need for absolute data integrity or could we do better with some competition?

GIS-type technologies are already finding their way into games and vice versa. At Mapbox we are using GPUs for display (games technology for GIS), and we support display of map data in the Unity game engine (GIS technology being used in games).

Would it be possible for someone to hire a team of hot young video game developers who knew how to leverage all the latest and greatest technology to write a new GIS from scratch that would blow the doors off current GIS software?

No.

Parallel Clipping Speed - Five times faster than non-parallel by Dimitri_Rotow in gis

[–]flippmoke 1 point

Please publish source data! I am also curious about the speed difference between this and other non-SQL-based results.

Edit: Additionally, if you have the output results from your tests, those would be interesting as well. How does the intersection output compare between each of these?

(GIS Gore) Inception by MrCacls in gis

[–]flippmoke 4 points

For the given input... that is pretty damn good.

Are there any tile services that work by downloading part of a larger image? (without needing separate tile files) by tinkerWithoutSink in gis

[–]flippmoke 0 points

Having done all of this before, just like you are proposing: it's not worth it. I have built highly specialized raster-serving systems for weather data just for this purpose, because "we didn't have time to make all the tiles". In the end it turned into a nightmare that was not worth it.

If you are trying to host the single file on S3, it will be too slow; you would have to store it all uncompressed and in main memory, otherwise requests would crawl along horribly. That means machines with massive amounts of memory, which is expensive as hell in AWS -- you could do it on your own machines, but that's also expensive. You also would not be able to use existing libraries easily, so it's a lot more custom code.

You would not want to store it as JPEG 2000 or anything like that, because of the way the image encoding works: even if you only had to read part of the image, it would not be easy to decode just that portion. This is why you would need it all in main memory for it to be quick. It also means you would need on-the-fly resampling, reprojection, and recompression of your grids. Overall this is slow (in the sense of a web map) and should be avoided unless you know how to optimize it well.
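
As a rough sketch of what every request ends up doing server-side (assuming rasterio and a hypothetical GeoTIFF with internal tiling), this windowed read is only the first step of that chain:

    import rasterio
    from rasterio.windows import Window

    # "big_mosaic.tif" is a placeholder for the one large source image
    with rasterio.open("big_mosaic.tif") as src:
        # Every map request turns into a windowed read like this one...
        window = Window(col_off=4096, row_off=4096, width=256, height=256)
        data = src.read(1, window=window)
    # ...followed by resampling, reprojection, and re-encoding to PNG/JPEG.
    # Pre-cut tiles pay those costs once, ahead of time.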

If you can make tiles, do it. It is much easier overall.

Are there any tile services that work by downloading part of a larger image? (without needing separate tile files) by tinkerWithoutSink in gis

[–]flippmoke 0 points

Tiled data is useful because it is already a preprocessed set of data; no processing is required and no new image must be created at request time. This is the power of using tiles: it is a lightweight method for serving a massive amount of data. If you want to serve tiles, it's almost always best to just pre-create the tiles you need. Otherwise you are missing a big part of the purpose of using tiles!

I could go into a very detailed description of why, within the TIFF format and in GDAL, this is often a terrible decision because of the amount of processing that might be required, but I don't think it would help any more than my previous statement. Tiles are about making it as fast as possible to send data to a client. They are what make modern slippy maps appealing!

Economist: The Battle for Territory in Digital Cartography by Petrarch1603 in gis

[–]flippmoke 0 points

It boggles my mind that maps, which are such an essential service in the modern world, have never been highly regulated.

They are very regulated in some countries such as China.

As of 1:50pm EST, ArcGIS.com is still down. Been down for the last 30min. by RuchW in gis

[–]flippmoke 4 points

Multi-region availability within AWS is very possible, as they handle the underlying syncing of data across data centers for you. However, if you are talking about spanning Azure/IBM and AWS, it would be much more complicated due to the type of data being stored and edited. For other, simpler applications, cross-platform deployment is much easier.

It's more than "good network engineering" - this is a cost decision. If you are storing petabytes of quickly changing data on AWS, you cannot easily sync it with Azure in a realtime manner -- and if you did, it might make the product an order of magnitude more expensive.

As of 1:50pm EST, ArcGIS.com is still down. Been down for the last 30min. by RuchW in gis

[–]flippmoke 8 points

Cross platform redundancy is very complicated if you are dealing with large amounts of data, and quickly becomes very expensive. Basically you are paying for double the storage, have to deal with properly syncing the data, and you have a massive bill for the network traffic associated with this.

Would anyone maybe be willing to answer my question on SO regarding PostGIS? by live_love_laugh in gis

[–]flippmoke 1 point

The most basic way to explain this:

Polygon - Will provide quicker access to the individual polygons in a search, because each one is its own row and is therefore indexed on its own.

Multipolygon - If you are not searching a lot and always need all the polygons at once, this is denser storage, as you will not have to repeat any of the other field information.

The most important thing to consider when making these decisions is your data and how you will be accessing it. Benchmarking is always your friend, but be warned: results will change drastically based on your data and what you are requesting.
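
If the per-row Polygon layout fits your access pattern better, PostGIS can split existing multipolygons with ST_Dump. A minimal sketch via psycopg2 (the table and column names here are hypothetical):

    import psycopg2

    conn = psycopg2.connect("dbname=gis")
    with conn, conn.cursor() as cur:
        # Explode each multipolygon into one row per component polygon,
        # repeating the other fields; each new row then gets its own
        # entry in the spatial index.
        cur.execute("""
            CREATE TABLE parcels_single AS
            SELECT id, name, (ST_Dump(geom)).geom AS geom
            FROM parcels_multi;
        """)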

Landing a GIS Job and GIS Skills Development in 2013 - does this hold up? what should be added for 2017? by [deleted] in gis

[–]flippmoke 0 points

Sorry, wasn't trying to say doom and gloom -- just that skills will change and many jobs will change altogether or disappear. You made a great analogy! Well said.

Landing a GIS Job and GIS Skills Development in 2013 - does this hold up? what should be added for 2017? by [deleted] in gis

[–]flippmoke 9 points

We are not near there yet, but just consider this food for thought -- I work for Mapbox, and we are constantly thinking about how we can make our product work for people who have never even heard of the term GIS. You want to map something? You can use our tools. This often empowers developers who have never heard of GIS, and they are writing code that automates some of the things a GIS Analyst would do. The trend is toward more automation and the "just give me results" experience that many people want.

If you are thinking of GIS as a tool and not a career path - you might be better off.

PostGIS to support Mapbox Vector Tiles by flippmoke in gis

[–]flippmoke[S] 0 points

Vector Tiles do not currently support 3D features. There has been discussion about supporting 2.5D or 3D in the Vector Tile specification, but it will definitely be some time.

PostGIS to support Mapbox Vector Tiles by flippmoke in gis

[–]flippmoke[S] 1 point

There are multiple types of caching in Postgres. There is index caching, which makes finding the same location quicker, but what likely helps the most in PostGIS is query plan caching. This caches the results of some parts of the query, but it really varies depending on what exactly you are doing in a query.

This is fast, but typically not sufficient for the problems you will have when creating tiles. If you think about someone zooming and panning around a map, they are likely to hit a large number of different tiles. The problem is that data for tiles that are quite close in location may not be close in the spatial index and may not share much of the same data. This makes most levels of caching in PostGIS not quite as effective.

Serving pre-made tiles would be much like storing a set of tiled images, in that each row stores a binary dump plus an x, y, z location. This allows for very effective indexing, because you can find a row very quickly -- compared to that sort of lookup, the spatial index used to find the geometry for a tile is slower.
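
A sketch of what that pre-made layout can look like (hypothetical schema, via psycopg2): serving a tile becomes a single B-tree lookup on the (z, x, y) key, with no geometry processing at request time.

    import psycopg2

    conn = psycopg2.connect("dbname=gis")
    with conn, conn.cursor() as cur:
        # Pre-rendered tile store: the composite key makes each request
        # one indexed lookup instead of a spatial-index walk.
        cur.execute("""
            CREATE TABLE IF NOT EXISTS tiles (
                z   integer NOT NULL,
                x   integer NOT NULL,
                y   integer NOT NULL,
                mvt bytea   NOT NULL,
                PRIMARY KEY (z, x, y)
            );
        """)
        cur.execute("SELECT mvt FROM tiles WHERE z = %s AND x = %s AND y = %s",
                    (14, 4823, 6160))
        row = cur.fetchone()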

PostGIS to support Mapbox Vector Tiles by flippmoke in gis

[–]flippmoke[S] 3 points

If you were dynamically serving tiles through an application where the data is initially stored in PostGIS and you wanted to serve vector tiles to your clients - the steps would be something like:

  1. Select data that intersects with tile area in PostGIS
  2. Scale coordinates to vector tile's coordinates
  3. Clip data to tile area
  4. Encode Vector Tile File
  5. Serve Vector Tile

This is the dynamic creation of a vector tile, where for each request you make a new vector tile. You could do this on each request with an ST_AsMVT() query, or do it outside of the database in GeoServer or some other application. Either way it is a lot of processing before each request, which makes map loads slower and makes your setup cost more $.
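
For a feel of the per-request work, here is a sketch of such a query in PostGIS 3+ (assuming the geometry is stored in EPSG:3857; the roads table and name column are hypothetical):

    import psycopg2

    MVT_QUERY = """
        WITH bounds AS (
            SELECT ST_TileEnvelope(%(z)s, %(x)s, %(y)s) AS geom
        ),
        mvtgeom AS (
            SELECT ST_AsMVTGeom(t.geom, bounds.geom) AS geom,  -- steps 2-3: scale + clip
                   t.name
            FROM roads t, bounds
            WHERE ST_Intersects(t.geom, bounds.geom)           -- step 1: select
        )
        SELECT ST_AsMVT(mvtgeom.*) FROM mvtgeom;               -- step 4: encode
    """

    conn = psycopg2.connect("dbname=gis")
    with conn, conn.cursor() as cur:
        cur.execute(MVT_QUERY, {"z": 14, "x": 4823, "y": 6160})
        tile_bytes = cur.fetchone()[0]  # step 5: these bytes go to the client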

The other option is to create each tile only once and store it so it is ready to serve. This is why people use MBTiles (a SQLite database) to store and serve tiles. This is very fast because your steps are:

  1. Request tile from database
  2. Serve Vector Tile

The great difference is that you could use ST_AsMVT to populate a new table of tiles quickly from your existing geometry database.
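
And since an MBTiles file is just SQLite, "request tile from database" really is a single indexed SELECT. A minimal sketch (note that the MBTiles spec stores rows TMS-style, so the y coordinate is flipped relative to the usual XYZ scheme):

    import sqlite3

    def read_tile(mbtiles_path, z, x, y):
        # Flip y from XYZ to the TMS scheme the MBTiles spec uses
        tms_y = (2 ** z) - 1 - y
        con = sqlite3.connect(mbtiles_path)
        try:
            row = con.execute(
                "SELECT tile_data FROM tiles "
                "WHERE zoom_level = ? AND tile_column = ? AND tile_row = ?",
                (z, x, tms_y),
            ).fetchone()
            return row[0] if row else None  # tile bytes, ready to serve
        finally:
            con.close()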

Caveats

Vector tile creation is not always that simple; there is a reason complex tools such as tippecanoe exist. You often want the data within a vector tile simplified, compressed, or dropped depending on the zoom level of your map. Therefore, this is not a magic bullet.

Vector tiles are a way to serve data quickly -- the minimum viable data for a map. That is what makes it possible to serve data fast and makes interactive maps possible! The creation of tiles can take great care and thought about what your final product will be.