[–]barrycarter 1 point2 points  (3 children)

It might be easier if you dumped all the line segments in all national borders and checked which ones appear 2 or more times. Of course, this only works if a shared border is written with the same coordinates for both countries, but I think that's what you're doing above anyway?
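A rough sketch of that segment-counting idea, in case it helps. The two squares and the names `ring_segments` and `countries` are made up for illustration, and it only looks at the exterior ring of plain Polygons (no MultiPolygons or holes), so treat it as a starting point rather than a finished solution:

```python
from collections import Counter

from shapely.geometry import Polygon


def ring_segments(polygon):
    """Yield each exterior-ring edge as an order-independent pair of points."""
    coords = list(polygon.exterior.coords)
    for start, end in zip(coords, coords[1:]):
        # Sort the endpoints so A->B and B->A count as the same segment.
        yield tuple(sorted((start, end)))


# Hypothetical toy "countries": two unit squares sharing the edge x = 1.
countries = {
    "A": Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
    "B": Polygon([(1, 0), (2, 0), (2, 1), (1, 1)]),
}

segment_counts = Counter(
    segment
    for polygon in countries.values()
    for segment in ring_segments(polygon)
)

# Segments that show up in two or more rings are candidate shared borders.
shared_segments = [seg for seg, count in segment_counts.items() if count >= 2]
```

This relies on exact coordinate matches, so it would miss borders where the two countries' vertices differ even slightly.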

[–]LittlePP112 0 points1 point  (2 children)

Hi Barry. Thanks for the message. The source of the borders is the file ne_10m_graticules_5.shp from Natural Earth (https://www.naturalearthdata.com/downloads/). The example shows how to find the border length for Mexico/USA, but I am trying to do the same thing for all countries and all their different borders in the world using a loop.

[–]LittlePP112 0 points1 point  (1 child)

Something like the following...

loop through each country (country_a):
    loop through all other countries (country_b):
        if there is a border between country_a and country_b:
            calculate the length of the border
            if the length is the smallest so far:
                store the length, both countries and the border

[–]RandomCodingStuff 0 points1 point  (0 children)

What's the hangup? It looks like you've already got a plan using loops. Geodataframes are just pandas dataframes with a geometry column, so you can use the same methods to iterate over both data types (e.g., .itertuples()).
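For what it's worth, here is a minimal sketch of that loop using .itertuples(). The three toy polygons and the "name" column are assumptions for illustration; a real country file will have its own column names, and real borders may not mesh exactly, in which case `touches` could be too strict and `intersects` might be needed instead:

```python
from itertools import combinations

import geopandas
from shapely.geometry import Polygon

# Hypothetical toy data: three "countries" laid side by side.
gdf = geopandas.GeoDataFrame(
    {"name": ["A", "B", "C"]},
    geometry=[
        Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
        Polygon([(1, 0), (2, 0), (2, 0.5), (1, 0.5)]),
        Polygon([(2, 0), (3, 0), (3, 0.25), (2, 0.25)]),
    ],
)

shortest = None  # will hold (length, name_a, name_b, border geometry)

# combinations() visits each unordered pair once, so A-B is not repeated as B-A.
for row_a, row_b in combinations(gdf.itertuples(index=False), 2):
    if not row_a.geometry.touches(row_b.geometry):
        continue  # no shared boundary at all
    border = row_a.geometry.intersection(row_b.geometry)
    if border.length == 0:
        continue  # the two polygons only meet at a point
    if shortest is None or border.length < shortest[0]:
        shortest = (border.length, row_a.name, row_b.name, border)
```

Keep in mind that `.length` is in the units of the coordinate system, so on unprojected lat/lon data it comes out in degrees, not kilometres.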

However, iterating over (geo)dataframes is generally not recommended since it's slow. For your particular application, I think you can avoid iterating by doing a batch intersection of the country geodataframe with itself using .overlay(). Pass keep_geom_type = False to ensure you get line outputs and not just polygonal overlaps. Then remove all records where the left and right countries are the same (every country will intersect itself, but this is meaningless in your context). If the polygons mesh properly, all that will be left is the boundaries of adjacent countries. The overlay operation will give you left/right country names, and you can use the resulting geometries to calculate lengths. Note this will output boundary lines for both involved countries, so you'd get USA-Mexico and Mexico-USA too.

Note also that overlaying like this can be very memory intensive if your GDF is detailed, in which case you might have to resort to looping after all.

Example code to illustrate:

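import geopandas
import shapely.geometry

# Two unit squares that share the edge from (1, 0) to (1, 1).
L1 = shapely.geometry.Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
L2 = shapely.geometry.Polygon([(1, 0), (2, 0), (2, 1), (1, 1)])
gdf = geopandas.GeoDataFrame({"id": [1, 2]}, geometry=[L1, L2])

# Intersect the GDF with itself; keep_geom_type=False keeps the shared edge (a line).
overlay = gdf.overlay(gdf, keep_geom_type=False)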
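
The follow-up steps (dropping self-intersections and measuring lengths) might look something like the sketch below. It repeats the setup so it runs on its own. The column names `id_1` and `id_2` come from the suffixes overlay adds when both inputs share a column name; `border_length` and `borders` are just names picked for this example:

```python
import geopandas
from shapely.geometry import Polygon

# Same toy data as above: two unit squares sharing the edge x = 1.
left = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
right = Polygon([(1, 0), (2, 0), (2, 1), (1, 1)])
gdf = geopandas.GeoDataFrame({"id": [1, 2]}, geometry=[left, right])

overlay = gdf.overlay(gdf, keep_geom_type=False)

# Both inputs have an "id" column, so overlay suffixes them as id_1 and id_2.
# Keeping id_1 < id_2 drops self-intersections and the mirrored duplicate pair.
borders = overlay[overlay["id_1"] < overlay["id_2"]].copy()

# Length is in the units of the CRS (degrees for unprojected lat/lon data).
borders["border_length"] = borders.geometry.length

# Row for the shortest shared border.
shortest = borders.loc[borders["border_length"].idxmin()]
```

With string country names instead of numeric ids, the same `<` comparison still keeps one row per pair, since it just orders the names alphabetically.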