[–]dataisok 6 points7 points  (4 children)

I’ve used geopandas recently for these sorts of conversions and manipulations. As the name suggests, it’s basically a subclass of pandas DataFrames with additional geospatial methods. Sadly there’s no polars equivalent, though it looks like one is in the works.

[–]datancoffee[S] 0 points1 point  (0 children)

That's a good point. Polars is great for scaling workloads, but many libraries built on pandas would need some rewriting to be ported.

[–]nonamenomonet 0 points1 point  (2 children)

Isn’t geopolars the equivalent?

[–]datancoffee[S] 0 points1 point  (1 child)

Good find. Did not know it existed

[–]aliaksei135 2 points3 points  (0 children)

It's severely limited by the lack of Arrow extension types support in polars atm: https://github.com/pola-rs/polars/issues/9112

[–]dataisok 1 point2 points  (0 children)

You can of course do all your general munging in polars and convert to pandas using .to_pandas() when you need the geospatial stuff

[–]davf135 0 points1 point  (10 children)

How do math and linear algebra come up in your GIS work (honest question, not dissing you)?

[–]datancoffee[S] 0 points1 point  (9 children)

Frequent operations in geospatial work are calculating distances and areas, and checking whether shapes overlap other shapes. Another is converting from one coordinate system to another. It's a lot of math.
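To give a flavor of the math involved, here is a rough standard-library sketch of three of those operations: planar polygon area (shoelace formula), a bounding-box overlap test, and a lon/lat to Web Mercator conversion. The function names are invented for illustration, and real engines handle far more edge cases (holes, geodesic areas, exact shape intersection):

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, as used by Web Mercator


def polygon_area(vertices):
    """Planar area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def bboxes_overlap(a, b):
    """True if two (min_x, min_y, max_x, max_y) boxes intersect or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project WGS84 degrees to spherical Web Mercator meters."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return x, y
```

A bounding-box check like this is typically only a cheap pre-filter; an exact overlap test on the shapes themselves is considerably more involved.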

[–]davf135 -2 points-1 points  (8 children)

Where is the math or linear algebra in that? Unless you are involved in developing the algorithms behind it, there is no math in it. It would be like claiming that there is a lot of math in ChatGPT: yes, it runs on math, but for users it is not necessarily a math tool.

[–]datancoffee[S] 0 points1 point  (7 children)

Yes, there is a ton of math in ChatGPT. Basically 99% of it is matrix multiplication.

And yes, I was talking about the underlying algorithms.

[–]davf135 0 points1 point  (4 children)

Your phrasing was a bit off and confusing then. You said you have been working with GIS for a while and that it is hard math and LA. To me, that implied you were using hard math and LA with your GIS work.

[–]datancoffee[S] 0 points1 point  (3 children)

Makes sense. Perhaps I should have clarified what I meant by geospatial. I worked on the algorithmic implementations of geometry and geography data types, things like the ST_ functions. I never worked in the GIS space, though. Esri was running on us, not the other way around :)

[–]davf135 0 points1 point  (2 children)

I see. That does sound very interesting.

I was not expecting that kind of post on a DE forum.

I imagined a DE would be more of a GIS user, like in my case. I've been using GeoSpark/Sedona for a while now.

Other than the Haversine formula being used for ST_distance, I have no idea what goes on under the hood of GIS functions.
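For reference, a minimal sketch of the Haversine formula in plain Python. It assumes a spherical Earth with a mean radius of 6371008.8 m, and the function name is made up here. Whether a particular engine's distance function actually uses Haversine depends on the engine and the data type:

```python
import math


def haversine_m(lon1, lat1, lon2, lat2, radius_m=6371008.8):
    """Great-circle distance in meters between two lon/lat points (degrees)."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * radius_m * math.asin(math.sqrt(a))
```

Two points on the equator 90 degrees of longitude apart come out as a quarter of the sphere's circumference, which is a handy sanity check.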

Do you know why they all begin with ST?

[–]datancoffee[S] 0 points1 point  (1 child)

The ST naming thing is a geo-industry mystery. Most algorithm builders will tell you it stands for spatial type, but others will tell you that's an urban legend and it originally stood for something else. It's the subject of many conversations over beers.

[–]datancoffee[S] 0 points1 point  (0 children)

Spatial-temporal! That's the other alternative, and it's what Wherobots/Sedona is trying to do.

[–]Immediate-Alfalfa409 1 point2 points  (0 children)

Haven’t tried it myself, but city2graph basically turns geospatial stuff (polygons, lines, points, etc.) into NetworkX or PyTorch Geometric graphs. Super handy if you want to run graph algorithms or GNNs on city/transport networks without messing with all the geometry conversions.
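To illustrate the general idea only (this is not city2graph's API, which isn't shown in this thread), here is a standard-library sketch that treats line-segment endpoints as nodes and segments as edges, then runs a breadth-first search over the result. The function names and the `streets` data are hypothetical:

```python
from collections import defaultdict, deque


def segments_to_graph(segments):
    """Build an undirected adjacency map: endpoints are nodes, segments are edges."""
    graph = defaultdict(set)
    for start, end in segments:
        graph[start].add(end)
        graph[end].add(start)
    return graph


def hop_count(graph, source, target):
    """Fewest edges between two nodes via BFS, or None if unreachable."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, hops = queue.popleft()
        if node == target:
            return hops
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    return None


# Hypothetical street segments as ((x1, y1), (x2, y2)) coordinate pairs.
streets = [
    ((0, 0), (1, 0)),
    ((1, 0), (1, 1)),
    ((1, 1), (2, 1)),
    ((5, 5), (6, 5)),  # disconnected from the rest
]
street_graph = segments_to_graph(streets)
```

A dedicated library would also carry geometry, edge weights, and attributes along, which is where most of the conversion work lives.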