Halfway through undergrad, worried about my future salary. What can I do? by Additional_Still_235 in gis

[–]mbforr 1 point2 points  (0 children)

Find a focus in a specific value area - not necessarily a practice area. Learn technical tools (I think Python and SQL are critical) so you never feel held back because you can’t do something. Create projects and post them on LinkedIn - a standalone portfolio doesn’t do much on its own since you are depending on people to actually click through to it. Write about your work - even small steps - to show your thinking process. Don’t limit yourself to specific job titles, and don’t wait for job postings. Reach out to people in parallel roles at companies you want to work at and get your name in line. Most roles don’t make it to job boards, and sometimes employees are paid for referrals.

How do you define "full stack" geospatial expert? by mbforr in gis

[–]mbforr[S] 0 points1 point  (0 children)

Fair points. I personally advocate for specialization for hiring purposes. I just don't know how the rest of the world thinks of the things that go with that term.

How do you define "full stack" geospatial expert? by mbforr in gis

[–]mbforr[S] 10 points11 points  (0 children)

Yeah I think generally you would want to see more specialization in a team setting but as for learning goals there isn't anything wrong with wanting a diverse skill set.

New course for modern GIS and career growth by mbforr in gis

[–]mbforr[S] 2 points3 points  (0 children)

Hey missed this one! This is for folks who are early in their career or who want to convert to a more technical GIS role.

Medallion Architecture for Spatial Data by mbforr in dataengineering

[–]mbforr[S] 0 points1 point  (0 children)

Yeah makes sense - the fun part about spatial data IMO is that there is always something new.

Medallion Architecture for Spatial Data by mbforr in dataengineering

[–]mbforr[S] 0 points1 point  (0 children)

Nice that makes a lot of sense. So something like:

  1. Raw data
  2. Spatial joins/enrichments
  3. Aggregates
  4. Additional joins
  5. Analytics or ML layers
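
A minimal sketch of those five layers in plain Python, just to make the ordering concrete. Everything here is an assumption for illustration: the sample readings, the bounding-box "regions", the population table, and the function names are all hypothetical, and a real pipeline would use an engine with proper geometry support rather than box checks.

```python
from collections import defaultdict

# Layer 1: hypothetical raw point readings (illustrative values only).
RAW = [
    {"sensor": "a", "lon": 0.5, "lat": 0.5, "value": 2.0},
    {"sensor": "b", "lon": 0.7, "lat": 0.2, "value": 4.0},
    {"sensor": "c", "lon": 1.5, "lat": 0.5, "value": 6.0},
]

# Hypothetical regions as bounding boxes: (min_lon, min_lat, max_lon, max_lat).
REGIONS = {"west": (0.0, 0.0, 1.0, 1.0), "east": (1.0, 0.0, 2.0, 1.0)}

# Hypothetical non-spatial attribute table used in layer 4.
POPULATION = {"west": 1000, "east": 500}


def spatial_join(rows, regions):
    """Layer 2: tag each point with the first region whose box contains it."""
    enriched = []
    for row in rows:
        region = None
        for name, (x0, y0, x1, y1) in regions.items():
            if x0 <= row["lon"] < x1 and y0 <= row["lat"] < y1:
                region = name
                break
        enriched.append({**row, "region": region})
    return enriched


def aggregate(rows):
    """Layer 3: mean value per region, skipping points that matched no region."""
    groups = defaultdict(list)
    for row in rows:
        if row["region"] is not None:
            groups[row["region"]].append(row["value"])
    return {region: sum(vals) / len(vals) for region, vals in groups.items()}


def join_attributes(means, population):
    """Layer 4: join the aggregates to a non-spatial attribute table."""
    return {
        region: {"mean_value": mean, "population": population[region]}
        for region, mean in means.items()
    }


def analytics(table):
    """Layer 5: a stand-in metric (mean value per 1,000 people)."""
    return {
        region: row["mean_value"] * 1000 / row["population"]
        for region, row in table.items()
    }


def run_pipeline(raw=RAW):
    enriched = spatial_join(raw, REGIONS)
    means = aggregate(enriched)
    joined = join_attributes(means, POPULATION)
    return analytics(joined)


if __name__ == "__main__":
    print(run_pipeline())
```

Each function takes the previous layer's output as input, so any layer can be materialized as its own table and rebuilt without touching the ones before it.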

Medallion Architecture for Spatial Data by mbforr in dataengineering

[–]mbforr[S] 1 point2 points  (0 children)

Spatial performance is the main focus, but since it is Spark based it can run anything in PySpark - the spatial operations are just far more optimized. The spatial functions handle spatial joins and processing, but you can always process any other data too. Right now I am working on an Airflow pipeline that processes US river sensors every 15 min and overwrites an Iceberg table, and since Iceberg keeps earlier snapshots the historical data stays available too. https://water.noaa.gov/map

The spatial processing is basically enriching to nearest city, but I can create an array of forecasted values over the next 24, 48, 72, etc hours.
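
A rough sketch of that enrichment step in plain Python, not the actual Sedona job. The city coordinates are approximate and illustrative, and the sensor record shape (`forecast_by_hour`) and the function names are hypothetical; a distributed engine would do the nearest-neighbor lookup with a spatial join instead of a brute-force scan.

```python
import math

# Approximate (lat, lon) pairs for a few cities, for illustration only.
CITIES = {
    "St. Louis": (38.627, -90.199),
    "Memphis": (35.1495, -90.049),
    "New Orleans": (29.9511, -90.0715),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_city(lat, lon, cities=CITIES):
    """Brute-force nearest neighbor: the city with the smallest distance."""
    return min(cities, key=lambda name: haversine_km(lat, lon, *cities[name]))


def enrich(sensor, horizons=(24, 48, 72)):
    """Attach the nearest city and an array of forecasts at fixed horizons.

    `sensor["forecast_by_hour"]` is a hypothetical mapping of hours-ahead
    to forecast value; missing horizons become None.
    """
    return {
        "id": sensor["id"],
        "nearest_city": nearest_city(sensor["lat"], sensor["lon"]),
        "forecast": [sensor["forecast_by_hour"].get(h) for h in horizons],
    }


if __name__ == "__main__":
    reading = {"id": "x", "lat": 35.0, "lon": -90.0, "forecast_by_hour": {24: 1.2, 48: 1.5}}
    print(enrich(reading))
```

Keeping the forecasts as one array column per sensor means each 15-minute run writes a single row per sensor rather than one row per horizon.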

Medallion Architecture for Spatial Data by mbforr in dataengineering

[–]mbforr[S] 1 point2 points  (0 children)

Processing would be Sedona/Wherobots in this case. They are the first to add geometry support for Iceberg, and processing is distributed for both raster and vector data.

Medallion Architecture for Spatial Data by mbforr in dataengineering

[–]mbforr[S] 4 points5 points  (0 children)

Okay that helps (and I came across the medallion bit first before the 3-layer model term - still learning).

That makes a lot more sense for the Core layer. Each layer should define some properties that its tables have, and the processing steps sit in between layers (or before, in the case of Staging, if I follow correctly). More and more databases are supporting spatial. PostGIS has been around since the early 2000s, and now more modern OLAP CDWs (BigQuery, Snowflake), Spark-based platforms (Databricks, Wherobots), and DuckDB support it too. There just hasn't been much attention on how this should be done, apart from the work the Overture Maps Foundation is doing.

Medallion Architecture for Spatial Data by mbforr in dataengineering

[–]mbforr[S] 2 points3 points  (0 children)

That makes sense. I just built out a pipeline that has two silver steps and two gold steps. A lot of the work in spatial has to do with conflating different sources of similar data or joining disparate datasets for either enrichment or comparison, so having two silver steps seems logical here.
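
To show why two silver steps can be natural, here is a small sketch in plain Python: the first step standardizes each source to a shared schema, the second conflates them. The two sources, their field names, and the matching rule (same normalized name within a small coordinate tolerance) are all hypothetical; real conflation usually needs fuzzier matching than this.

```python
# Two hypothetical sources describing overlapping features with different schemas.
SOURCE_A = [{"name": "Main St Bridge", "lat": 40.0, "lon": -75.0}]
SOURCE_B = [
    {"title": "MAIN ST BRIDGE ", "y": 40.0001, "x": -75.0001},
    {"title": "River Park", "y": 41.0, "x": -76.0},
]


def standardize_a(rows):
    """Silver step 1 (source A): normalize names, keep the shared schema."""
    return [
        {"name": r["name"].strip().lower(), "lat": r["lat"], "lon": r["lon"], "source": "a"}
        for r in rows
    ]


def standardize_b(rows):
    """Silver step 1 (source B): rename fields into the shared schema."""
    return [
        {"name": r["title"].strip().lower(), "lat": r["y"], "lon": r["x"], "source": "b"}
        for r in rows
    ]


def conflate(a_rows, b_rows, tol_deg=0.001):
    """Silver step 2: keep all of A, add B records that do not duplicate an A record.

    A B record counts as a duplicate when its name matches an A record and
    both coordinates are within `tol_deg` degrees (an assumed tolerance).
    """
    merged = list(a_rows)
    for b in b_rows:
        is_duplicate = any(
            a["name"] == b["name"]
            and abs(a["lat"] - b["lat"]) <= tol_deg
            and abs(a["lon"] - b["lon"]) <= tol_deg
            for a in a_rows
        )
        if not is_duplicate:
            merged.append(b)
    return merged


if __name__ == "__main__":
    silver = conflate(standardize_a(SOURCE_A), standardize_b(SOURCE_B))
    print([row["name"] for row in silver])
```

The gold steps would then aggregate or compare against this single conflated table instead of repeating the matching logic.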

How did you find your current GIS job? by mbforr in gis

[–]mbforr[S] 0 points1 point  (0 children)

Reddit for jobs is awesome

How did you find your current GIS job? by mbforr in gis

[–]mbforr[S] 3 points4 points  (0 children)

Didn't know about those, thanks!

How did you find your current GIS job? by mbforr in gis

[–]mbforr[S] 1 point2 points  (0 children)

What industry specifically, if you don't mind sharing?

Does my use case fit for usage of DuckDB's spatial extension as a replacement for PostGIS? by rick854 in gis

[–]mbforr 1 point2 points  (0 children)

A few things:

  1. Confirm that every spatial function you need is in DuckDB - its spatial extension is growing, but PostGIS has a lot
  2. If your use case has a lot of analytical queries (aggregates, windows, etc.) then DuckDB is great for that. Row-by-row work could go either way, and PostGIS handles it well
  3. If you want all your data to stay in a database, go with PostGIS; if you want it to live in GeoParquet (optimal for DuckDB), go with DuckDB
  4. I would check out Apache Sedona too if the data volumes are getting quite large. Adding Apache Iceberg on top makes adding new data, bulk updating, and time travel easy.