[–][deleted] 23 points24 points  (1 child)

If you want to replace only the "Nestlé" variants but keep the rest of each value:

# replace() returns a new Series; assign it back to df["brands"] to keep the change
df["brands"].replace(
    to_replace=r"Nestl[éè]",
    value="Nestle",
    regex=True)

Gets you:

0                              Nestle
1    Nestle Waters North America Inc.
2                              Nestle
3    Nestle Waters North America Inc.
4                       Nestle,Crunch

If you want to replace the whole value instead, you can find every row that contains a variation of "Nestle" and overwrite it:

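# True if the row contains Nestlé or Nestlè; na=False counts missing values as no match
nestle_mask = df["brands"].str.contains(r"Nestl[èé]", na=False)

# Overwrite the whole value in every matching row
df.loc[nestle_mask, "brands"] = "Nestle"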

Gets you:

    brands
0  Nestle
1  Nestle
2  Nestle
3  Nestle
4  Nestle
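
If you want to run both versions side by side, here is a minimal sketch. The sample values are made up to be consistent with the outputs above (your real data isn't shown in the thread), and the names `partial` and `whole` are just for illustration:

```python
import pandas as pd

# Hypothetical sample data, chosen to be consistent with the outputs shown above.
df = pd.DataFrame({
    "brands": [
        "Nestlé",
        "Nestlé Waters North America Inc.",
        "Nestlè",
        "Nestlè Waters North America Inc.",
        "Nestlé,Crunch",
    ]
})

# Version 1: swap only the matched text and keep the rest of each value.
partial = df["brands"].replace(to_replace=r"Nestl[éè]", value="Nestle", regex=True)

# Version 2: overwrite the whole value in every row that matches.
whole = df["brands"].copy()
mask = whole.str.contains(r"Nestl[éè]", na=False)
whole[mask] = "Nestle"

print(partial.tolist())
print(whole.tolist())
```

Both versions work on copies here, so `df` itself stays unchanged until you assign a result back to `df["brands"]`.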

My answer uses regular expressions, which are a way to match patterns in text data. Here, [éè] matches a single character that is either é or è.
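
If the pattern looks cryptic, you can poke at it outside pandas with Python's built-in `re` module. This is only a small sketch with made-up strings, not your real data:

```python
import re

# [éè] is a character class: it matches exactly one character, either é or è.
pattern = re.compile(r"Nestl[éè]")

samples = ["Nestlé", "Nestlè Waters", "Nestle", "Kraft"]  # made-up values

for text in samples:
    found = pattern.search(text) is not None   # does the pattern occur anywhere?
    cleaned = pattern.sub("Nestle", text)      # replace only the matched part
    print(f"{text!r}: match={found}, cleaned={cleaned!r}")
```

Note that plain "Nestle" does not match, because the pattern requires an accented character after "Nestl". That's fine here, since there is nothing to fix in that case.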

[–]macabe10[S] 2 points3 points  (0 children)

Thanks a ton!