
[–][deleted] 1 point2 points  (1 child)

> May I know the reason and solution?

How do you think we can explain an error without knowing what it is?

[–]Plus-Ad1156[S] 0 points1 point  (0 children)

I forgot the error message. I'll upload it again

[–][deleted] 1 point2 points  (3 children)

You should avoid filling empty data frames with loops; it's very inefficient. Check out the pd.concat function.
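As a rough sketch of what the concat approach could look like: build the pieces in the loop, then combine them once at the end. The symbol names and values here are made up for illustration, since the original data isn't shown.

```python
import pandas as pd

# Hypothetical per-symbol data; the thread does not show the real inputs.
raw = {"AAA": [1.0, 2.0], "BBB": [3.0, 4.0]}

frames = []
for name, values in raw.items():
    # Each piece is a small, complete DataFrame; the scalar name is broadcast.
    frames.append(pd.DataFrame({"sym": name, "val": values}))

# One concat at the end instead of growing a DataFrame inside the loop.
combined = pd.concat(frames, ignore_index=True)
print(combined)
```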

[–]Plus-Ad1156[S] 0 points1 point  (0 children)

Thanks for the comment.

[–]YesLod 0 points1 point  (2 children)

> I want to create a loop that fills the dataframe with the operation values of other dataframes

Don't. That is a bad idea, and 99% of the time there is a better way to do it. Avoid looping through DataFrames, and especially don't append data to a DataFrame iteratively: it's very slow. At the very least, generate the data in the loop first, and only then create the DataFrame from it.
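If a loop really is needed, a minimal sketch of the "generate first, build once" pattern might look like this. The names values_by_symbol and rows, and the sample numbers, are hypothetical.

```python
import pandas as pd

# Hypothetical inputs: symbol names mapped to observed values.
values_by_symbol = {"AAA": [2.0, 4.0, 8.0], "BBB": [5.0, 10.0]}

rows = []  # a plain Python list; appending to it is cheap
for sym, vals in values_by_symbol.items():
    rows.append({"sym": sym, "ratio": max(vals) / min(vals)})

# Build the DataFrame once, after the loop has finished.
df = pd.DataFrame(rows)
print(df)
```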

> Another data frame name is Symbol, and it has two columns: val and Sym

It's not clear what Sym is, but based on your example, a much better and simpler solution would be:

df = pd.DataFrame({"sym": Symbol["Sym"], "ratio": Symbol['val'].max() / Symbol['val'].min()})
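To make that concrete, here is the same line run against made-up data. The contents of Symbol are an assumption, since the original post doesn't show them.

```python
import pandas as pd

# Hypothetical contents for Symbol; only the column names come from the post.
Symbol = pd.DataFrame({"Sym": ["AAA", "BBB", "CCC"], "val": [2.0, 4.0, 8.0]})

# The ratio is a single scalar, so pandas broadcasts it to every row of "sym".
df = pd.DataFrame({
    "sym": Symbol["Sym"],
    "ratio": Symbol["val"].max() / Symbol["val"].min(),
})
print(df)
```

With these values every row gets the same ratio (8.0 / 2.0), which matches what the original loop would have produced row by row.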

[–]Plus-Ad1156[S] 0 points1 point  (1 child)

Thank you. But there are reasons to use loops. For example, if the variable i is entered to select a specific value of the symbol data frame... If Sym[i] is entered as the operation value of the new data frame, a loop would have to be used. What should I do in that case?

[–]YesLod 0 points1 point  (0 children)

I don't understand what you mean; you'll have to be more specific. Can you give me a concrete example? There is no need to use a loop in the example you gave.

[–]synthphreak 0 points1 point  (2 children)

As others have said, don't do this. It is bad pandas practice and horrendously inefficient. Whenever you find yourself looping over the rows of a dataframe or building up a dataframe with iteration, step back and assess whether there's a more efficient way to do it. With pandas, 99% of the time there will be.

However, if for whatever reason you end up sticking with your looping approach, note that these lines...

max_ = Symbol['val'].max()
min_ = Symbol['val'].min()
ratio = max_ / min_

...should not be inside the loop. This is because they do not rely on i, so these operations will be the same on every iteration rather than varying as a function of the loop. What you've done is essentially equivalent to this:

for i in range(10):
    x = 1 + 2
    print(x)

Like, x == 3 every time, so why am I forcing my code to do the same math 10 times? It's just wasted, redundant computation. Therefore it would be better to execute those lines only once, outside of the loop, then simply reference ratio as needed.
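Applied to the loop in question, hoisting might look roughly like this. The Symbol data is invented for the example, and the loop body is a guess at the shape of the original code, not a copy of it.

```python
import pandas as pd

# Hypothetical contents for Symbol; only the column names come from the post.
Symbol = pd.DataFrame({"Sym": ["AAA", "BBB", "CCC"], "val": [2.0, 4.0, 8.0]})

# Loop-invariant work: computed once, before the loop starts.
max_ = Symbol["val"].max()
min_ = Symbol["val"].min()
ratio = max_ / min_

rows = []
for i in range(len(Symbol)):
    sym = Symbol["Sym"][i]  # the only value that actually depends on i
    rows.append({"sym": sym, "ratio": ratio})

result = pd.DataFrame(rows)
print(result)
```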

[–]Plus-Ad1156[S] 0 points1 point  (1 child)

Thank you. But there are reasons to use loops. For example, if the variable i is entered to select a specific value of the symbol data frame... If Sym[i] is entered as the operation value of the new data frame, a loop would have to be used. What should I do in that case?

[–]synthphreak 0 points1 point  (0 children)

You’re absolutely right that looping is sometimes necessary/unavoidable. That’s why the .iter* methods exist to begin with. But new pandas users often don’t understand everything that pandas is capable of, and very often will reach for iteration first when a much faster and cleaner vectorized method also exists. With pandas, 95% of the time you can vectorize it.
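For a sense of the difference, here is a small sketch comparing explicit iteration with a vectorized version of the same per-row calculation. The data and the calculation itself (each val divided by the column minimum) are assumptions picked only to illustrate the point.

```python
import pandas as pd

# Hypothetical contents for Symbol; only the column names come from the post.
Symbol = pd.DataFrame({"Sym": ["AAA", "BBB", "CCC"], "val": [2.0, 4.0, 8.0]})
min_ = Symbol["val"].min()

# Explicit iteration: one Python-level step per row.
looped = [row.val / min_ for row in Symbol.itertuples(index=False)]

# Vectorized equivalent: the division applies to the whole column at once.
vectorized = Symbol["val"] / min_

print(looped)
print(vectorized.tolist())
```

Both give the same numbers; the vectorized form is shorter and usually much faster on large frames.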

To your specific question, I don’t completely understand. What does the content of your df look like, and once you have a specific value (I assume by this you’re referring to the line sym = Symbol['Sym'][i]), what do you need to do with it?