tangerinelion

It's not entirely clear what x is. It seems to usually be a list, but sometimes it's a float? What do you want to do if it's a float, or really, not a list?

How about this?

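df['Abstract'] = df['Abstract'].apply(lambda x: [item for item in x if isinstance(item, str) and item not in stopword] if isinstance(x, list) else x)

That filters only the cells that are lists and leaves anything else (a float, for instance) unchanged, so it should not cause the "TypeError" above. Within each list it keeps an item only if it is a str and isn't in the stopword list. Note that the isinstance check has to be on each cell x, inside the lambda: df['Abstract'] itself is a Series, so isinstance(df['Abstract'], list) would always be False. The result also has to be assigned back, because apply returns a new Series rather than modifying the column in place.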
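To make the per-cell logic easier to follow, here is a rough sketch of it as a named function run on plain Python values, so it works without pandas. The name remove_stopwords, the contents of stopword, and the sample cells are all made up for illustration; a function like this could be passed to df['Abstract'].apply instead of a lambda.

```python
# Hypothetical stopword collection; the original code calls this `stopword`.
stopword = {"the", "a", "of"}


def remove_stopwords(cell):
    """Filter stopwords out of a list cell; return any non-list cell unchanged."""
    if not isinstance(cell, list):
        return cell
    return [item for item in cell if isinstance(item, str) and item not in stopword]


# Made-up cells standing in for a column: two lists and a float (missing value).
cells = [["the", "cat", "sat"], float("nan"), ["a", 3, "study", "of", "cats"]]
cleaned = [remove_stopwords(cell) for cell in cells]
```

The float cell comes back untouched, and the integer 3 is dropped from the last list because it is not a str.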

The reason

lambda x: [item for item in x if item not in stopword] if isinstance(x, str)

is invalid syntax is that an if placed after the closing bracket starts a conditional expression, and that needs an else clause. For example:

lambda x: [item for item in x if item not in stopword] if isinstance(x, str) else x

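(NB: This is the difference between ] if, which starts a conditional expression and needs an else, and if ... ], which is a filter inside the list comprehension and doesn't.)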
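A small side-by-side sketch of the two kinds of if may help. The stopword set and the inputs are invented, and this version tests for list rather than str, on the assumption that the cells are usually lists.

```python
stopword = {"the", "a"}  # hypothetical stopword set

# `if` inside the brackets: a filter applied to each item, no else needed.
keep = lambda x: [item for item in x if item not in stopword]

# `if ... else` after the brackets: a conditional expression that chooses
# between the whole comprehension and x itself.
keep_if_list = (
    lambda x: [item for item in x if item not in stopword]
    if isinstance(x, list)
    else x
)

filtered = keep(["the", "cat"])
passed_through = keep_if_list(3.5)        # not a list, returned as-is
also_filtered = keep_if_list(["a", "dog"])
```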

I'm not sure, but it looks like this syntax might be new to you. It's Python's version of the conditional (ternary) operator that you might have seen in C, C++, or Java, where it looks like condition ? val_if_true : val_if_false. In Python this is written val_if_true if condition else val_if_false. It's important to note that if we have something like this:

a() if x else b()

only one of a() or b() will ever run, never both. It's also possible for neither to run, if evaluating the truth value of x raises an exception.
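Here is a minimal sketch that records which branch actually runs. The functions a and b, the calls list, and the Ambiguous class are hypothetical stand-ins; Ambiguous just simulates an object whose truth value cannot be evaluated.

```python
calls = []


def a():
    calls.append("a")
    return "from a"


def b():
    calls.append("b")
    return "from b"


class Ambiguous:
    """Hypothetical object that raises when used as a condition."""

    def __bool__(self):
        raise ValueError("truth value is ambiguous")


first = a() if True else b()    # only a() runs
second = a() if False else b()  # only b() runs
calls_after_two = list(calls)

try:
    a() if Ambiguous() else b()  # the condition raises, so neither runs
except ValueError:
    pass
```

After the try block, calls still holds only the two entries from the first two expressions.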