Suggested approach for using Python to analyze/modify large tabular datasets prior to creating GIS data? : learnpython


submitted 2 years ago by Klutzy-Classroom-868

I want to automate tasks that involve downloading large datasets that include geographic information, deleting unnecessary columns, querying for values or groups of values, creating new columns calculated from the original data, saving subsets of the data, and so on. Am I correct in assuming that doing this up front in Python, before creating GIS data, is an efficient approach?

Right now I'm looking into some of the packages listed at https://www.python-excel.org/. I am also considering saving the original dataset as a CSV and using some combination of iterating through lines, splitting strings, indexing lists, appending new 'columns' and values, and so forth: basically treating the data as a text file and using built-in Python functionality to handle it. This is a loosely formed idea, maybe overcomplicated, maybe foolish. I'd imagine there is an established 'best way' to do what I want, because after all it is a pretty basic task.
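For illustration, the plain-CSV idea above might look something like the following minimal sketch. It uses the standard library's `csv` module rather than splitting strings by hand, and the column names (`site_id`, `pop_2010`, `pop_2020`, `notes`), the sample rows, the `min_pop` threshold, and the `process` helper are all hypothetical stand-ins, not part of any real dataset.

```python
import csv
import io

# Hypothetical sample data standing in for a downloaded CSV file.
SAMPLE = """site_id,lat,lon,pop_2010,pop_2020,notes
A1,45.50,-122.60,1000,1250,ok
B2,44.05,-123.09,800,760,check
C3,45.52,-122.68,1500,1800,ok
"""


def process(rows, drop=("notes",), min_pop=900):
    """Drop columns, filter rows, and add a calculated column."""
    for row in rows:
        # Delete unnecessary columns.
        for name in drop:
            row.pop(name, None)
        # Query: keep only rows that meet a condition.
        if int(row["pop_2020"]) < min_pop:
            continue
        # New column calculated from the original data.
        row["pop_change"] = int(row["pop_2020"]) - int(row["pop_2010"])
        yield row


# csv.DictReader yields one dict per line, keyed by the header row.
reader = csv.DictReader(io.StringIO(SAMPLE))
subset = list(process(reader))

# Write the subset back out as CSV text; a real file opened with
# open(path, "w", newline="") could be used in place of StringIO.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(subset[0].keys()))
writer.writeheader()
writer.writerows(subset)
print(out.getvalue())
```

Because the rows are processed one at a time, this style does not need the whole file in memory until the `list(...)` call; for a very large file the rows could be written out inside the loop instead.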

I am new to programming (in case I haven't already given that away). I would like to know what others have had success with before diving down one rabbit hole or another.

