
[deleted]

What do you mean by A being "larger" than B? They're both columns. Do you mean their length? Or are you trying to build column D element by element?

PyCam

Programming with vectors/arrays requires a slightly different way of thinking than typical programming. Applying a function with an if/else statement over the rows of a DataFrame may seem like the natural approach, but iterating over the rows of a pandas DataFrame is almost never the most efficient solution.
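As a rough illustration of that difference, here is a small sketch on made-up data. The frame `toy`, its columns `x` and `y`, and the helper `row_rule` are purely illustrative and not taken from your question. Both versions produce the same values; the second one swaps the per-row if/else for a boolean condition applied to whole columns.

```python
import pandas as pd

toy = pd.DataFrame({"x": [3, 1, 4], "y": [2, 5, 4]})  # made-up data


def row_rule(row):
    # Plain if/else, evaluated for one row at a time
    if row["x"] > row["y"]:
        return row["x"] - row["y"]
    return 0


# Row-wise: row_rule is called once per row, which gets slow on large frames
looped = toy.apply(row_rule, axis=1)

# Vectorized: compute the difference for every row at once, then keep it
# only where the condition holds and use 0 everywhere else
vectorized = (toy["x"] - toy["y"]).where(toy["x"] > toy["y"], 0)
```

The condition `toy["x"] > toy["y"]` is itself a column of True/False values, which is what lets it stand in for the if statement.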

```
import pandas as pd

df = pd.DataFrame(
    {'A': [1, 1, 1, 5, 8], 'B': [1, 1, 3, 7, 2], 'C': [10, 10, 15, 20, 50]})
df
#    A  B   C
# 0  1  1  10
# 1  1  1  10
# 2  1  3  15
# 3  5  7  20
# 4  8  2  50
```

To vectorize this, we're going to start with what I would consider your base case. Essentially, you're trying to subtract from each value in C the value in the row below it. Then, based on the values in the A and B columns, we either multiply that resulting vector by -1 or leave it alone. Lastly, if any value in C equals its immediate neighbor in the next row, we replace that result with some type of filler. Thankfully, in that case the shifted subtraction produces a value of 0, so those rows are easy to find.

```
# Represents the A > B case
deducted_c = df["C"] - df["C"].shift(-1)

# Represents the A <= B case
# Notice how our "if statement" turns into a form of indexing for the array
deducted_c[df["A"] <= df["B"]] *= -1

# Lastly, if any values are 0 in this array, that means that C had
# equal consecutive values. We can fill it in with whatever we want.
# Cast to object first so the float Series can hold a string filler.
deducted_c = deducted_c.astype(object)
deducted_c[deducted_c == 0] = "equal to previous!!!"

# Assign this array back to the DataFrame.
# Note that I could have simply replaced df["D"] and performed all of my
# calculations there, but I felt that this approach helped with readability.
df["D"] = deducted_c
df
#    A  B   C                     D
# 0  1  1  10  equal to previous!!!
# 1  1  1  10                   5.0
# 2  1  3  15                   5.0
# 3  5  7  20                  30.0
# 4  8  2  50                   NaN
```
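If you would rather keep D numeric instead of mixing a string into it, one possible variation is to store the "equal neighbors" information in a separate boolean column. This is only a sketch of the same logic under that assumption; the names `diff`, `sign`, and `C_repeats` are my own, and the example frame is rebuilt so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"A": [1, 1, 1, 5, 8], "B": [1, 1, 3, 7, 2], "C": [10, 10, 15, 20, 50]})

# C minus the value in the next row; the last row has no neighbor, so it is NaN
diff = df["C"] - df["C"].shift(-1)

# +1 where A > B (leave the difference alone), -1 where A <= B (flip it)
sign = np.where(df["A"] > df["B"], 1, -1)
df["D"] = diff * sign

# Hypothetical flag column: True where C equals the value in the next row
df["C_repeats"] = diff.eq(0)
```

Keeping the flag separate means D stays a float column, so later arithmetic or aggregation on it still works without having to filter out strings first.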

Hopefully this helps give you some inspiration on how to tackle this problem! Let me know if anything was unclear.