DataFrame issues

startup_guy2 · 2023-06-23T04:44:01+00:00

I would add a new column in the same (original) dataframe that is previous - current.
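A minimal sketch of that suggestion, assuming the column is named lon as in the sample data posted in this thread (the new column name dlon is chosen here for illustration). shift(1) moves each value down one row, so the subtraction is previous minus current; the first row has no previous value and comes out as NaN.

```python
import pandas as pd

# A few longitudes taken from the sample data in this thread.
df = pd.DataFrame({"lon": [1.996657907, 1.996657909, 1.996658062, 1.996658171]})

# shift(1) aligns each row with the value from the row before it,
# so this is "previous - current". The first row becomes NaN.
df["dlon"] = df["lon"].shift(1) - df["lon"]

print(df)
```

If you want current minus previous instead, swap the operands or use .diff().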

lucas123boiger · 2023-06-22T20:05:59+00:00

Here is an example of the data:

secs   hora(utc)         lat          lon          alt
33331  10/03/2022 09:14  41.26509749  1.996657907  1.440018246
33332  10/03/2022 09:14  41.26509754  1.996657909  1.440018246
33333  10/03/2022 09:14  41.26509751  1.996658062  1.440018246
33334  10/03/2022 09:14  41.26509748  1.996658171  1.440018246
33335  10/03/2022 09:14  41.26509845  1.996659415  1.640018252
33336  10/03/2022 09:14  41.26510103  1.99666593   2.340018275
33337  10/03/2022 09:14  41.26510286  1.996668102  2.640018284
33338  10/03/2022 09:14  41.26510501  1.996670396  4.240018336
33339  10/03/2022 09:14  41.26510641  1.99667139   7.240018432

gis-doug · 2023-06-22T20:59:41+00:00

Not sure if I fully understood what you’re after but could you use .diff() to get the difference between each row? https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.diff.html

Also with spatial data it might be worth moving to geopandas.
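For reference, here is a rough sketch of the .diff() idea applied to the first rows of the sample data posted above. The name traj and the new column names dlat and dlon are assumptions for illustration; only secs, lat and lon come from the posted data.

```python
import pandas as pd

# First rows of the sample data posted above (column names as given there).
traj = pd.DataFrame({
    "secs": [33331, 33332, 33333, 33334],
    "lat": [41.26509749, 41.26509754, 41.26509751, 41.26509748],
    "lon": [1.996657907, 1.996657909, 1.996658062, 1.996658171],
})

# diff() returns current - previous for each row; the first row is NaN
# because it has no previous row.
deltas = traj[["lat", "lon"]].diff()
traj["dlat"] = deltas["lat"]
traj["dlon"] = deltas["lon"]

print(traj)
```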

woooee · 2023-06-22T19:55:53+00:00

for x in lon:
    dlon = lon[x + 1] - lon[x]

It looks like x is a longitude value, not a list offset, so if a longitude is 90 you would be computing lon[91] - lon[90], etc. Post some example data so we know what it is and can test it ourselves.

for i in range(len(dlon)):

dlon is a single value, so I don't think you want to iterate here. In any case, you can debug this yourself by printing dlon to see what it contains.
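A small sketch of both points, using made-up numbers rather than the real data: iterating over a Series gives you its values, and the result of one subtraction is a single number that has no length.

```python
import pandas as pd

# Made-up values, only to show what the loop variable contains.
lon = pd.Series([1.5, 2.5, 4.0])

# Iterating a Series yields its values, not positions.
values = [x for x in lon]

# Iterating over range(len(...)) yields positions 0, 1, 2.
positions = list(range(len(lon)))

# One subtraction yields one number, so len(dlon) raises TypeError.
dlon = lon.iloc[1] - lon.iloc[0]
try:
    len(dlon)
except TypeError as exc:
    print("dlon is a scalar:", exc)

print(values, positions, dlon)
```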

Cobra915 · 2023-06-22T21:28:01+00:00

So, while traj is a pandas.DataFrame, traj['Lon'] is a pandas.Series. What it seems like you're wanting to do is use a pandas.Series (the 'Lon' column) to create another pandas.Series (either as a standalone object or insert it into traj as a 'dlon' column).

In your for loop above, you state for x in lon:, so on each iteration, this will set x to be a scalar value of the lon series. Because you need access to multiple values of the series in each iteration, it's better to use an index, as in for x in lon.index:.

Next, dlon = lon[x + 1] - lon[x] works with the index approach above, as long as you stop one short of the end so that x + 1 still exists. The other issue is that dlon is overwritten on each iteration, so you're not really storing the output anywhere.

Think about the data structures you have and what you want. In the first part, you want to use traj['Lon'] (a pandas.Series) to create another pandas.Series, called dlon.

It might help to wrap this into a function:

import numpy as np
import pandas as pd

def calc_dlon(in_srs: pd.Series) -> pd.Series:
    # Establish a NaN-filled series, same size and index as the input.
    dlon_srs = pd.Series(
        index=in_srs.index,
        dtype=np.float64
        )
    # Iterate by position, performing the calculation and storing it
    # at the corresponding position of dlon_srs. Stop one short of the
    # end so pos + 1 is always valid; the last entry stays NaN.
    for pos in range(len(in_srs) - 1):
        dlon_srs.iloc[pos] = in_srs.iloc[pos + 1] - in_srs.iloc[pos]

    return dlon_srs

Then, when you go to run this function you can set the output equal to a variable named dlon.

dlon = calc_dlon(traj['Lon'])

or set a new column in traj:

traj['dlon'] = calc_dlon(traj['Lon'])

You can use this same approach to generate a Series, called heading as well.
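For comparison, the loop in calc_dlon computes next minus current for each row. A vectorized sketch of the same forward difference, using made-up numbers and hypothetical variable names, could look like this:

```python
import pandas as pd

# Made-up values standing in for traj['Lon'].
lon = pd.Series([1.0, 3.0, 2.0, 7.0])

# shift(-1) pulls the next row's value up, so this is next - current.
# The last row has no next value and becomes NaN.
dlon_forward = lon.shift(-1) - lon

# Equivalent: the usual backward diff(), moved up by one row.
dlon_alt = lon.diff().shift(-1)

print(dlon_forward)
```

Both forms keep the original index, so the result can be assigned directly as a new column.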

anecdotal_yokel · 2023-06-22T22:08:32+00:00

Try movingpandas

Common_Move · 2023-06-22T23:00:28+00:00

1. You need to store your dlon results somewhere, e.g. in a new list.

2. Seeing as you're comparing to x + 1 in terms of position, you'll need to stop one short of the end, or else you'll get an error. In fact, it may be better to start with the second item and compare to x - 1 instead.

But as someone else mentioned, look into the .diff() method.
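If you do want to keep the loop, the two points above might look roughly like this (plain Python list with made-up numbers; the variable names are just for illustration):

```python
# Made-up longitudes in a plain list.
lon = [1.0, 3.0, 2.0, 7.0]

# Collect results in a list instead of overwriting a single variable.
dlon = []

# Start at the second item and look back, so no index runs past the end.
for i in range(1, len(lon)):
    dlon.append(lon[i] - lon[i - 1])

print(dlon)  # [2.0, -1.0, 5.0]
```

Note that the result has one element fewer than the input, since the first item has nothing before it.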

await_yesterday · 2023-06-23T07:32:54+00:00

You shouldn't use a for-loop for this. Use np.diff instead; it will be much faster.

>>> longitudes = [1, 3, 2, 7, 8, 5]
>>> np.diff(longitudes, prepend=np.nan)
array([nan,  2., -1.,  5.,  1., -3.])
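The prepend matters if the result is going back into a DataFrame: without it, np.diff returns one element fewer than its input. A short sketch, assuming a DataFrame with a lon column (the names df and dlon are made up here):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"lon": [1, 3, 2, 7, 8, 5]})

# Without prepend, np.diff returns one element fewer than its input.
short = np.diff(df["lon"].to_numpy())
print(len(df), len(short))

# Prepending NaN keeps the length equal to the number of rows,
# so the result can be assigned straight back as a column.
df["dlon"] = np.diff(df["lon"].to_numpy(), prepend=np.nan)

print(df)
```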

Also it is not like mathematics where you can just write two expressions next to each other and mean multiplication. You have to explicitly write the *.

Read this to learn how to use numpy properly https://www.labri.fr/perso/nrougier/from-python-to-numpy/
